Search Results: "angel"

11 May 2023

Simon Josefsson: Streamlined NTRU Prime sntrup761 goes to IETF

The OpenSSH project added support for a hybrid Streamlined NTRU Prime post-quantum key encapsulation method sntrup761 to strengthen their X25519-based default in their version 8.5 released on 2021-03-03. While there has been a lot of talk about post-quantum crypto generally, my impression has been that there has been a slowdown in implementing and deploying them in the past two years. Why is that? Regardless of the answer, we can try to collaboratively change things, and one effort that appears strangely missing are IETF documents for these algorithms. Building on some earlier work that added X25519/X448 to SSH, writing a similar document was relatively straight-forward once I had spent a day reading OpenSSH and TinySSH source code to understand how it worked. While I am not perfectly happy with how the final key is derived from the sntrup761/X25519 secrets it is a SHA512 call on the concatenated secrets I think the construct deserves to be better documented, to pave the road for increased confidence or better designs. Also, reusing the RFC5656 4 structs makes for a worse specification (one unnecessary normative reference), but probably a simpler implementation. I have published draft-josefsson-ntruprime-ssh-00 here. Credit here goes to Jan Moj of TinySSH that designed the earlier sntrup4591761x25519-sha512@tinyssh.org in 2018, Markus Friedl who added it to OpenSSH in 2019, and Damien Miller that changed it to sntrup761 in 2020. Does anyone have more to add to the history of this work? Once I had sharpened my xml2rfc skills, preparing a document describing the hybrid construct between the sntrup761 key-encapsulation mechanism and the X25519 key agreement method in a non-SSH fashion was easy. I do not know if this work is useful, but it may serve as a reference for further study. I published draft-josefsson-ntruprime-hybrid-00 here. Finally, how about a IETF document on the base Streamlined NTRU Prime? Explaining all the details, and especially the math behind it would be a significant effort. I started doing that, but realized it is a subjective call when to stop explaining things. If we can t assume that the reader knows about lattice math, is a document like this the best place to teach it? I settled for the most minimal approach instead, merely giving an introduction to the algorithm, included SageMath and C reference implementations together with test vectors. The IETF audience rarely understands math, so I think it is better to focus on the bits on the wire and the algorithm interfaces. Everything here was created by the Streamlined NTRU Prime team, I merely modified it a bit hoping I didn t break too much. I have now published draft-josefsson-ntruprime-streamlined-00 here. I maintain the IETF documents on my ietf-ntruprime GitLab page, feel free to open merge requests or raise issues to help improve them. To have confidence in the code was working properly, I ended up preparing a branch with sntrup761 for the GNU-project Nettle and have submitted it upstream for review. I had the misfortune of having to understand and implement NIST s DRBG-CTR to compute the sntrup761 known-answer tests, and what a mess it is. Why does a deterministic random generator support re-seeding? Why does it support non-full entropy derivation? What s with the key size vs block size confusion? What s with the optional parameters? What s with having multiple algorithm descriptions? Luckily I was able to extract a minimal but working implementation that is easy to read. I can t locate DRBG-CTR test vectors, anyone? Does anyone have sntrup761 test vectors that doesn t use DRBG-CTR? One final reflection on publishing known-answer tests for an algorithm that uses random data: are the test vectors stable over different ways to implement the algorithm? Just consider of some optimization moved one randomness-extraction call before another, then wouldn t the output be different? Are there other ways to verify correctness of implementations? As always, happy hacking!

3 May 2023

John Goerzen: Martha the Pilot

Martha, now 5, can t remember a time when she didn t fly periodically. She s come along in our airplane in short flights to a nearby restaurant and long ones to Michigan and South Dakota. All this time, she s been riding in the back seat next to Laura. Martha has been talking excitedly about riding up front next to me. She wants to be my co-pilot . I promised to give her an airplane wing pin when she did one I got from a pilot of a commercial flight when I was a kid. Of course, safety was first, so I wanted to be sure she was old enough to fly there without being a distraction. Last weekend, the moment finally arrived. She was so excited! She brought along her Claire bear aviator, one that I bought for her at an airport a little while back. She buckled in two of her dolls in the back seat. Martha's dolls And then up we went! Martha in the airplane Martha was so proud when we landed! We went to Stearman Field, just a short 10-minute flight away, and parked the plane right in front of the restaurant. We flew back, and Martha thought we should get a photo of her standing on the wing by the door. Great idea! Martha standing on the wing She was happily jabbering about the flight all the way home. She told us several times about the pin she got, watching out the window, watching all the screens in the airplane, and also that she didn t get sick at all despite some turbulence. And, she says, Now just you and I can go flying! Yes, that s something I m looking forward to!

29 April 2023

Russ Allbery: INN 2.7.1

This is a bug fix and minor feature release over INN 2.7.0, and the upgrade should be painless. You can download the new release from ISC or my personal INN pages. The latter also has links to the full changelog and the other INN documentation. As of this release, we're no longer generating hashes and signed hashes. Instead, the release is a simple tarball and a detached GnuPG signature, similar to my other software releases. We're also maintaining the releases in parallel on GitHub. For the full list of changes, see the INN 2.7.1 NEWS file. As always, thanks to Julien LIE for preparing this release and doing most of the maintenance work on INN!

28 April 2023

Sven Hoexter: What's wrong in IT: commit messages

In my day job someone today took the time in the team daily to explain his research why some of our configuration is wrong. He spent quite some time on his own to look at the history in git and how everything was setup initially, and ended up in the current - wrong - way. That triggered me to validate that quickly, another 5min of work. So we agreed to change it. A one line change, nothing spectacular, but lifetime was invested to figure out why it should've a different value. When the pull request got opened a few minutes later there was nothing of that story in the commit message. Zero, nada, nothing. :( I'm really puzzled why someone invests lifetime to dig into company internal history to try get something right, do a lengthy explanation to the whole team, use the time of others, even mention that there was no explanation of why it's not the default value anymore it should be, and repeat the same mistake by not writing down anything in the commit message. For the current company I'm inclined to propose a commit message validator. For a potential future company I might join, I guess I ask for real world git logs from repositories I should contribute to. Seems that this is another valuable source of information to qualify the company culture. Next up to the existence of whiteboards in the office. I'm really happy that at least a majority of the people contributing to Debian writes somewhat decent commit messages and changelogs. Let that be a reminder to myself to improve in that area the next time I've to change something.

14 April 2023

John Goerzen: Easily Accessing All Your Stuff with a Zero-Trust Mesh VPN

Probably everyone is familiar with a regular VPN. The traditional use case is to connect to a corporate or home network from a remote location, and access services as if you were there. But these days, the notion of corporate network and home network are less based around physical location. For instance, a company may have no particular office at all, may have a number of offices plus a number of people working remotely, and so forth. A home network might have, say, a PVR and file server, while highly portable devices such as laptops, tablets, and phones may want to talk to each other regardless of location. For instance, a family member might be traveling with a laptop, another at a coffee shop, and those two devices might want to communicate, in addition to talking to the devices at home. And, in both scenarios, there might be questions about giving limited access to friends. Perhaps you d like to give a friend access to part of your file server, or as a company, you might have contractors working on a limited project. Pretty soon you wind up with a mess of VPNs, forwarded ports, and tricks to make it all work. With the increasing prevalence of CGNAT, a lot of times you can t even open a port to the public Internet. Each application or device probably has its own gateway just to make it visible on the Internet, some of which you pay for. Then you add on the question of: should you really trust your LAN anyhow? With possibilities of guests using it, rogue access points, etc., the answer is probably no . We can move the responsibility for dealing with NAT, fluctuating IPs, encryption, and authentication, from the application layer further down into the network stack. We then arrive at a much simpler picture for all. So this page is fundamentally about making the network work, simply and effectively.

How do we make the Internet work in these scenarios? We re going to combine three concepts:
  1. A VPN, providing fully encrypted and authenticated communication and stable IPs
  2. Mesh Networking, in which devices automatically discover optimal paths to reach each other
  3. Zero-trust networking, in which we do not need to trust anything about the underlying LAN, because all our traffic uses the secure systems in points 1 and 2.
By combining these concepts, we arrive at some nice results:
  • You can ssh hostname, where hostname is one of your machines (server, laptop, whatever), and as long as hostname is up, you can reach it, wherever it is, wherever you are.
    • Combined with mosh, these sessions will be durable even across moving to other host networks.
    • You could just as well use telnet, because the underlying network should be secure.
  • You don t have to mess with encryption keys, certs, etc., for every internal-only service. Since IPs are now trustworthy, that s all you need. hosts.allow could make a comeback!
  • You have a way of transiting out of extremely restrictive networks. Every tool discussed here has a way of falling back on routing things via a broker (relay) on TCP port 443 if all else fails.
There might sometimes be tradeoffs. For instance:
  • On LANs faster than 1Gbps, performance may degrade due to encryption and encapsulation overhead. However, these tools should let hosts discover the locality of each other and not send traffic over the Internet if the devices are local.
  • With some of these tools, hosts local to each other (on the same LAN) may be unable to find each other if they can t reach the control plane over the Internet (Internet is down or provider is down)
Some other features that some of the tools provide include:
  • Easy sharing of limited access with friends/guests
  • Taking care of everything you need, including SSL certs, for exposing a certain on-net service to the public Internet
  • Optional routing of your outbound Internet traffic via an exit node on your network. Useful, for instance, if your local network is blocking tons of stuff.
Let s dive in.

Types of Mesh VPNs I ll go over several types of meshes in this article:
  1. Fully decentralized with automatic hop routing This model has no special central control plane. Nodes discover each other in various ways, and establish routes to each other. These routes can be direct connections over the Internet, or via other nodes. This approach offers the greatest resilience. Examples I ll cover include Yggdrasil and tinc.
  2. Automatic peer-to-peer with centralized control In this model, nodes, by default, communicate by establishing direct links between them. A regular node never carries traffic on behalf of other nodes. Special-purpose relays are used to handle cases in which NAT traversal is impossible. This approach tends to offer simple setup. Examples I ll cover include Tailscale, Zerotier, Nebula, and Netmaker.
  3. Roll your own and hybrid approaches This is a grab bag of other ideas; for instance, running Yggdrasil over Tailscale.

Terminology For the sake of consistency, I m going to use common language to discuss things that have different terms in different ecosystems:
  • Every tool discussed here has a way of dealing with NAT traversal. It may assist with establishing direct connections (eg, STUN), and if that fails, it may simply relay traffic between nodes. I ll call such a relay a broker . This may or may not be the same system that is a control plane for a tool.
  • All of these systems operate over lower layers that are unencrypted. Those lower layers may be a LAN (wired or wireless, which may or may not have Internet access), or the public Internet (IPv4 and/or IPv6). I m going to call the unencrypted lower layer, whatever it is, the clearnet .

Evaluation Criteria Here are the things I want to see from a solution:
  • Secure, with all communications end-to-end encrypted and authenticated, and prevention of traffic from untrusted devices.
  • Flexible, adapting to changes in network topology quickly and automatically.
  • Resilient, without single points of failure, and with devices local to each other able to communicate even if cut off from the Internet or other parts of the network.
  • Private, minimizing leakage of information or metadata about me and my systems
  • Able to traverse CGNAT without having to use a broker whenever possible
  • A lesser requirement for me, but still a nice to have, is the ability to include others via something like Internet publishing or inviting guests.
  • Fully or nearly fully Open Source
  • Free or very cheap for personal use
  • Wide operating system support, including headless Linux on x86_64 and ARM.

Fully Decentralized VPNs with Automatic Hop Routing Two systems fit this description: Yggdrasil and Tinc. Let s dive in.

Yggdrasil I ll start with Yggdrasil because I ve written so much about it already. It featured in prior posts such as:

Yggdrasil can be a private mesh VPN, or something more Yggdrasil can be a private mesh VPN, just like the other tools covered here. It s unique, however, in that a key goal of the project is to also make it useful as a planet-scale global mesh network. As such, Yggdrasil is a testbed of new ideas in distributed routing designed to scale up to massive sizes and all sorts of connection conditions. As of 2023-04-10, the main global Yggdrasil mesh has over 5000 nodes in it. You can choose whether or not to participate. Every node in a Yggdrasil mesh has a public/private keypair. Each node then has an IPv6 address (in a private address space) derived from its public key. Using these IPv6 addresses, you can communicate right away. Yggdrasil differs from most of the other tools here in that it does not necessarily seek to establish a direct link on the clearnet between, say, host A and host G for them to communicate. It will prefer such a direct link if it exists, but it is perfectly happy if it doesn t. The reason is that every Yggdrasil node is also a router in the Yggdrasil mesh. Let s sit with that concept for a moment. Consider:
  • If you have a bunch of machines on your LAN, but only one of them can peer over the clearnet, that s fine; all the other machines will discover this route to the world and use it when necessary.
  • All you need to run a broker is just a regular node with a public IP address. If you are participating in the global mesh, you can use one (or more) of the free public peers for this purpose.
  • It is not necessary for every node to know about the clearnet IP address of every other node (improving privacy). In fact, it s not even necessary for every node to know about the existence of all the other nodes, so long as it can find a route to a given node when it s asked to.
  • Yggdrasil can find one or more routes between nodes, and it can use this knowledge of multiple routes to aggressively optimize for varying network conditions, including combinations of, say, downloads and low-latency ssh sessions.
Behind the scenes, Yggdrasil calculates optimal routes between nodes as necessary, using a mesh-wide DHT for initial contact and then deriving more optimal paths. (You can also read more details about the routing algorithm.) One final way that Yggdrasil is different from most of the other tools is that there is no separate control server. No node is special , in charge, the sole keeper of metadata, or anything like that. The entire system is completely distributed and auto-assembling.

Meeting neighbors There are two ways that Yggdrasil knows about peers:
  • By broadcast discovery on the local LAN
  • By listening on a specific port (or being told to connect to a specific host/port)
Sometimes this might lead to multiple ways to connect to a node; Yggdrasil prefers the connection auto-discovered by broadcast first, then the lowest-latency of the defined path. In other words, when your laptops are in the same room as each other on your local LAN, your packets will flow directly between them without traversing the Internet.

Unique uses Yggdrasil is uniquely suited to network-challenged situations. As an example, in a post-disaster situation, Internet access may be unavailable or flaky, yet there may be many local devices perhaps ones that had never known of each other before that could share information. Yggdrasil meets this situation perfectly. The combination of broadcast auto-detection, distributed routing, and so forth, basically means that if there is any physical path between two nodes, Yggdrasil will find and enable it. Ad-hoc wifi is rarely used because it is a real pain. Yggdrasil actually makes it useful! Its broadcast discovery doesn t require any IP address provisioned on the interface at all (it just uses the IPv6 link-local address), so you don t need to figure out a DHCP server or some such. And, Yggdrasil will tend to perform routing along the contours of the RF path. So you could have a laptop in the middle of a long distance relaying communications from people farther out, because it could see both. Or even a chain of such things.

Yggdrasil: Security and Privacy Yggdrasil s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure you keep unauthorized traffic out: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. Both can be used, and they can be used simultaneously. I ll discuss firewalling more at the end of this article. Basically, you ll almost certainly want to do this if you participate in the public mesh, because doing so is akin to having a globally-routable public IP address direct to your device. If you want to restrict who can talk to your mesh, you just disable the broadcast feature on all your nodes (empty MulticastInterfaces section in the config), and avoid telling any of your nodes to connect to a public peer. You can set a list of authorized public keys that can connect to your nodes listening interfaces, which you ll probably want to do. You will probably want to either open up some inbound ports (if you can) or set up a node with a known clearnet IP on a place like a $5/mo VPS to help with NAT traversal (again, setting AllowedPublicKeys as appropriate). Yggdrasil doesn t allow filtering multicast clients by public key, only by network interface, so that s why we disable broadcast discovery. You can easily enough teach Yggdrasil about static internal LAN IPs of your nodes and have things work that way. (Or, set up an internal gateway node or two, that the clients just connect to when they re local). But fundamentally, you need to put a bit more thought into this with Yggdrasil than with the other tools here, which are closed-only. Compared to some of the other tools here, Yggdrasil is better about information leakage; nodes only know details, such as clearnet IPs, of directly-connected peers. You can obtain the list of directly-connected peers of any known node in the mesh but that list is the public keys of the directly-connected peers, not the clearnet IPs. Some of the other tools contain a limited integrated firewall of sorts (with limited ACLs and such). Yggdrasil does not, but is fully compatible with on-host firewalls. I recommend these anyway even with many other tools.

Yggdrasil: Connectivity and NAT traversal Compared to the other tools, Yggdrasil is an interesting mix. It provides a fully functional mesh and facilitates connectivity in situations in which no other tool can. Yet its NAT traversal, while it exists and does work, results in using a broker under some of the more challenging CGNAT situations more often than some of the other tools, which can impede performance. Yggdrasil s underlying protocol is TCP-based. Before you run away screaming that it must be slow and unreliable like OpenVPN over TCP it s not, and it is even surprisingly good around bufferbloat. I ve found its performance to be on par with the other tools here, and it works as well as I d expect even on flaky 4G links. Overall, the NAT traversal story is mixed. On the one hand, you can run a node that listens on port 443 and Yggdrasil can even make it speak TLS (even though that s unnecessary from a security standpoint), so you can likely get out of most restrictive firewalls you will ever encounter. If you join the public mesh, know that plenty of public peers do listen on port 443 (and other well-known ports like 53, plus random high-numbered ones). If you connect your system to multiple public peers, there is a chance though a very small one that some public transit traffic might be routed via it. In practice, public peers hopefully are already peered with each other, preventing this from happening (you can verify this with yggdrasilctl debug_remotegetpeers key=ABC...). I have never experienced a problem with this. Also, since latency is a factor in routing for Yggdrasil, it is highly unlikely that random connections we use are going to be competitive with datacenter peers.

Yggdrasil: Sharing with friends If you re open to participating in the public mesh, this is one of the easiest things of all. Have your friend install Yggdrasil, point them to a public peer, give them your Yggdrasil IP, and that s it. (Well, presumably you also open up your firewall you did follow my advice to set one up, right?) If your friend is visiting at your location, they can just hop on your wifi, install Yggdrasil, and it will automatically discover a route to you. Yggdrasil even has a zero-config mode for ephemeral nodes such as certain Docker containers. Yggdrasil doesn t directly support publishing to the clearnet, but it is certainly possible to proxy (or even NAT) to/from the clearnet, and people do.

Yggdrasil: DNS There is no particular extra DNS in Yggdrasil. You can, of course, run a DNS server within Yggdrasil, just as you can anywhere else. Personally I just add relevant hosts to /etc/hosts and leave it at that, but it s up to you.

Yggdrasil: Source code, pricing, and portability Yggdrasil is fully open source (LGPLv3 plus additional permissions in an exception) and highly portable. It is written in Go, and has prebuilt binaries for all major platforms (including a Debian package which I made). There is no charge for anything with Yggdrasil. Listed public peers are free and run by volunteers. You can run your own peers if you like; they can be public and unlisted, public and listed (just submit a PR to get it listed), or private (accepting connections only from certain nodes keys). A peer in this case is just a node with a known clearnet IP address. Yggdrasil encourages use in other projects. For instance, NNCP integrates a Yggdrasil node for easy communication with other NNCP nodes.

Yggdrasil conclusions Yggdrasil is tops in reliability (having no single point of failure) and flexibility. It will maintain opportunistic connections between peers even if the Internet is down. The unique added feature of being able to be part of a global mesh is a nice one. The tradeoffs include being more prone to need to use a broker in restrictive CGNAT environments. Some other tools have clients that override the OS DNS resolver to also provide resolution of hostnames of member nodes; Yggdrasil doesn t, though you can certainly run your own DNS infrastructure over Yggdrasil (or, for that matter, let public DNS servers provide Yggdrasil answers if you wish). There is also a need to pay more attention to firewalling or maintaining separation from the public mesh. However, as I explain below, many other options have potential impacts if the control plane, or your account for it, are compromised, meaning you ought to firewall those, too. Still, it may be a more immediate concern with Yggdrasil. Although Yggdrasil is listed as experimental, I have been using it for over a year and have found it to be rock-solid. They did change how mesh IPs were calculated when moving from 0.3 to 0.4, causing a global renumbering, so just be aware that this is a possibility while it is experimental.

tinc tinc is the oldest tool on this list; version 1.0 came out in 2003! You can think of tinc as something akin to an older Yggdrasil without the public option. I will be discussing tinc 1.0.36, the latest stable version, which came out in 2019. The development branch, 1.1, has been going since 2011 and had its latest release in 2021. The last commit to the Github repo was in June 2022. Tinc is the only tool here to support both tun and tap style interfaces. I go into the difference more in the Zerotier review below. Tinc actually provides a better tap implementation than Zerotier, with various sane options for broadcasts, but I still think the call for an Ethernet, as opposed to IP, VPN is small. To configure tinc, you generate a per-host configuration and then distribute it to every tinc node. It contains a host s public key. Therefore, adding a host to the mesh means distributing its key everywhere; de-authorizing it means removing its key everywhere. This makes it rather unwieldy. tinc can do LAN broadcast discovery and mesh routing, but generally speaking you must manually teach it where to connect initially. Somewhat confusingly, the examples all mention listing a public address for a node. This doesn t make sense for a laptop, and I suspect you d just omit it. I think that address is used for something akin to a Yggdrasil peer with a clearnet IP. Unlike all of the other tools described here, tinc has no tool to inspect the running state of the mesh. Some of the properties of tinc made it clear I was unlikely to adopt it, so this review wasn t as thorough as that of Yggdrasil.

tinc: Security and Privacy As mentioned above, every host in the tinc mesh is authenticated based on its public key. However, to be more precise, this key is validated only at the point it connects to its next hop peer. (To be sure, this is also the same as how the list of allowed pubkeys works in Yggdrasil.) Since IPs in tinc are not derived from their key, and any host can assign itself whatever mesh IP it likes, this implies that a compromised host could impersonate another. It is unclear whether packets are end-to-end encrypted when using a tinc node as a router. The fact that they can be routed at the kernel level by the tun interface implies that they may not be.

tinc: Connectivity and NAT traversal I was unable to find much information about NAT traversal in tinc, other than that it does support it. tinc can run over UDP or TCP and auto-detects which to use, preferring UDP.

tinc: Sharing with friends tinc has no special support for this, and the difficulty of configuration makes it unlikely you d do this with tinc.

tinc: Source code, pricing, and portability tinc is fully open source (GPLv2). It is written in C and generally portable. It supports some very old operating systems. Mobile support is iffy. tinc does not seem to be very actively maintained.

tinc conclusions I haven t mentioned performance in my other reviews (see the section at the end of this post). But, it is so poor as to only run about 300Mbps on my 2.5Gbps network. That s 1/3 the speed of Yggdrasil or Tailscale. Combine that with the unwieldiness of adding hosts and some uncertainties in security, and I m not going to be using tinc.

Automatic Peer-to-Peer Mesh VPNs with centralized control These tend to be the options that are frequently discussed. Let s talk about the options.

Tailscale Tailscale is a popular choice in this type of VPN. To use Tailscale, you first sign up on tailscale.com. Then, you install the tailscale client on each machine. On first run, it prints a URL for you to click on to authorize the client to your mesh ( tailnet ). Tailscale assigns a mesh IP to each system. The Tailscale client lets the Tailscale control plane gather IP information about each node, including all detectable public and private clearnet IPs. When you attempt to contact a node via Tailscale, the client will fetch the known contact information from the control plane and attempt to establish a link. If it can contact over the local LAN, it will (it doesn t have broadcast autodetection like Yggdrasil; the information must come from the control plane). Otherwise, it will try various NAT traversal options. If all else fails, it will use a broker to relay traffic; Tailscale calls a broker a DERP relay server. Unlike Yggdrasil, a Tailscale node never relays traffic for another; all connections are either direct P2P or via a broker. Tailscale, like several others, is based around Wireguard; though wireguard-go rather than the in-kernel Wireguard. Tailscale has a number of somewhat unique features in this space:
  • Funnel, which lets you expose ports on your system to the public Internet via the VPN.
  • Exit nodes, which automate the process of routing your public Internet traffic over some other node in the network. This is possible with every tool mentioned here, but Tailscale makes switching it on or off a couple of quick commands away.
  • Node sharing, which lets you share a subset of your network with guests
  • A fantastic set of documentation, easily the best of the bunch.
Funnel, in particular, is interesting. With a couple of tailscale serve -style commands, you can expose a directory tree (or a development webserver) to the world. Tailscale gives you a public hostname, obtains a cert for it, and proxies inbound traffic to you. This is subject to some unspecified bandwidth limits, and you can only choose from three public ports, so it s not really a production solution but as a quick and easy way to demonstrate something cool to a friend, it s a neat feature.

Tailscale: Security and Privacy With Tailscale, as with the other tools in this category, one of the main threats to consider is the control plane. What are the consequences of a compromise of Tailscale s control plane, or of the credentials you use to access it? Let s begin with the credentials used to access it. Tailscale operates no identity system itself, instead relying on third parties. For individuals, this means Google, Github, or Microsoft accounts; Okta and other SAML and similar identity providers are also supported, but this runs into complexity and expense that most individuals aren t wanting to take on. Unfortunately, all three of those types of accounts often have saved auth tokens in a browser. Personally I would rather have a separate, very secure, login. If a person does compromise your account or the Tailscale servers themselves, they can t directly eavesdrop on your traffic because it is end-to-end encrypted. However, assuming an attacker obtains access to your account, they could:
  • Tamper with your Tailscale ACLs, permitting new actions
  • Add new nodes to the network
  • Forcibly remove nodes from the network
  • Enable or disable optional features
Of note is that they cannot just commandeer an existing IP. I would say the riskiest possibility here is that could add new nodes to the mesh. Because they could also tamper with your ACLs, they could then proceed to attempt to access all your internal services. They could even turn on service collection and have Tailscale tell them what and where all the services are. Therefore, as with other tools, I recommend a local firewall on each machine with Tailscale. More on that below. Tailscale has a new alpha feature called tailnet lock which helps with this problem. It requires existing nodes in the mesh to sign a request for a new node to join. Although this doesn t address ACL tampering and some of the other things, it does represent a significant help with the most significant concern. However, tailnet lock is in alpha, only available on the Enterprise plan, and has a waitlist, so I have been unable to test it. Any Tailscale node can request the IP addresses belonging to any other Tailscale node. The Tailscale control plane captures, and exposes to you, this information about every node in your network: the OS hostname, IP addresses and port numbers, operating system, creation date, last seen timestamp, and NAT traversal parameters. You can optionally enable service data capture as well, which sends data about open ports on each node to the control plane. Tailscale likes to highlight their key expiry and rotation feature. By default, all keys expire after 180 days, and traffic to and from the expired node will be interrupted until they are renewed (basically, you re-login with your provider and do a renew operation). Unfortunately, the only mention I can see of warning of impeding expiration is in the Windows client, and even there you need to edit a registry key to get the warning more than the default 24 hours in advance. In short, it seems likely to cut off communications when it s most important. You can disable key expiry on a per-node basis in the admin console web interface, and I mostly do, due to not wanting to lose connectivity at an inopportune time.

Tailscale: Connectivity and NAT traversal When thinking about reliability, the primary consideration here is being able to reach the Tailscale control plane. While it is possible in limited circumstances to reach nodes without the Tailscale control plane, it is a fairly brittle setup and notably will not survive a client restart. So if you use Tailscale to reach other nodes on your LAN, that won t work unless your Internet is up and the control plane is reachable. Assuming your Internet is up and Tailscale s infrastructure is up, there is little to be concerned with. Your own comfort level with cloud providers and your Internet should guide you here. Tailscale wrote a fantastic article about NAT traversal and they, predictably, do very well with it. Tailscale prefers UDP but falls back to TCP if needed. Broker (DERP) servers step in as a last resort, and Tailscale clients automatically select the best ones. I m not aware of anything that is more successful with NAT traversal than Tailscale. This maximizes the situations in which a direct P2P connection can be used without a broker. I have found Tailscale to be a bit slow to notice changes in network topography compared to Yggdrasil, and sometimes needs a kick in the form of restarting the client process to re-establish communications after a network change. However, it s possible (maybe even probable) that if I d waited a bit longer, it would have sorted this all out.

Tailscale: Sharing with friends I touched on the funnel feature earlier. The sharing feature lets you give an invite to an outsider. By default, a person accepting a share can make only outgoing connections to the network they re invited to, and cannot receive incoming connections from that network this makes sense. When sharing an exit node, you get a checkbox that lets you share access to the exit node as well. Of course, the person accepting the share needs to install the Tailnet client. The combination of funnel and sharing make Tailscale the best for ad-hoc sharing.

Tailscale: DNS Tailscale s DNS is called MagicDNS. It runs as a layer atop your standard DNS taking over /etc/resolv.conf on Linux and provides resolution of mesh hostnames and some other features. This is a concept that is pretty slick. It also is a bit flaky on Linux; dueling programs want to write to /etc/resolv.conf. I can t really say this is entirely Tailscale s fault; they document the problem and some workarounds. I would love to be able to add custom records to this service; for instance, to override the public IP for a service to use the in-mesh IP. Unfortunately, that s not yet possible. However, MagicDNS can query existing nameservers for certain domains in a split DNS setup.

Tailscale: Source code, pricing, and portability Tailscale is almost fully open source and the client is highly portable. The client is open source (BSD 3-clause) on open source platforms, and closed source on closed source platforms. The DERP servers are open source. The coordination server is closed source, although there is an open source coordination server called Headscale (also BSD 3-clause) made available with Tailscale s blessing and informal support. It supports most, but not all, features in the Tailscale coordination server. Tailscale s pricing (which does not apply when using Headscale) provides a free plan for 1 user with up to 20 devices. A Personal Pro plan expands that to 100 devices for $48 per year - not a bad deal at $4/mo. A Community on Github plan also exists, and then there are more business-oriented plans as well. See the pricing page for details. As a small note, I appreciated Tailscale s install script. It properly added Tailscale s apt key in a way that it can only be used to authenticate the Tailscale repo, rather than as a systemwide authenticator. This is a nice touch and speaks well of their developers.

Tailscale conclusions Tailscale is tops in sharing and has a broad feature set and excellent documentation. Like other solutions with a centralized control plane, device communications can stop working if the control plane is unreachable, and the threat model of the control plane should be carefully considered.

Zerotier Zerotier is a close competitor to Tailscale, and is similar to it in a lot of ways. So rather than duplicate all of the Tailscale information here, I m mainly going to describe how it differs from Tailscale. The primary difference between the two is that Zerotier emulates an Ethernet network via a Linux tap interface, while Tailscale emulates a TCP/IP network via a Linux tun interface. However, Zerotier has a number of things that make it be a somewhat imperfect Ethernet emulator. For one, it has a problem with broadcast amplification; the machine sending the broadcast sends it to all the other nodes that should receive it (up to a set maximum). I wouldn t want to have a lot of programs broadcasting on a slow link. While in theory this could let you run Netware or DECNet across Zerotier, I m not really convinced there s much call for that these days, and Zerotier is clearly IP-focused as it allocates IP addresses and such anyhow. Zerotier provides special support for emulated ARP (IPv4) and NDP (IPv6). While you could theoretically run Zerotier as a bridge, this eliminates the zero trust principle, and Tailscale supports subnet routers, which provide much of the same feature set anyhow. A somewhat obscure feature, but possibly useful, is Zerotier s built-in support for multipath WAN for the public interface. This actually lets you do a somewhat basic kind of channel bonding for WAN.

Zerotier: Security and Privacy The picture here is similar to Tailscale, with the difference that you can create a Zerotier-local account rather than relying on cloud authentication. I was unable to find as much detail about Zerotier as I could about Tailscale - notably I couldn t find anything about how sticky an IP address is. However, the configuration screen lets me delete a node and assign additional arbitrary IPs within a subnet to other nodes, so I think the assumption here is that if your Zerotier account (or the Zerotier control plane) is compromised, an attacker could remove a legit device, add a malicious one, and assign the previous IP of the legit device to the malicious one. I m not sure how to mitigate against that risk, as firewalling specific IPs is ineffective if an attacker can simply take them over. Zerotier also lacks anything akin to Tailnet Lock. For this reason, I didn t proceed much further in my Zerotier evaluation.

Zerotier: Connectivity and NAT traversal Like Tailscale, Zerotier has NAT traversal with STUN. However, it looks like it s more limited than Tailscale s, and in particular is incompatible with double NAT that is often seen these days. Zerotier operates brokers ( root servers ) that can do relaying, including TCP relaying. So you should be able to connect even from hostile networks, but you are less likely to form a P2P connection than with Tailscale.

Zerotier: Sharing with friends I was unable to find any special features relating to this in the Zerotier documentation. Therefore, it would be at the same level as Yggdrasil: possible, maybe even not too difficult, but without any specific help.

Zerotier: DNS Unlike Tailscale, Zerotier does not support automatically adding DNS entries for your hosts. Therefore, your options are approximately the same as Yggdrasil, though with the added option of pushing configuration pointing to your own non-Zerotier DNS servers to the client.

Zerotier: Source code, pricing, and portability The client ZeroTier One is available on Github under a custom business source license which prevents you from using it in certain settings. This license would preclude it being included in Debian. Their library, libzt, is available under the same license. The pricing page mentions a community edition for self hosting, but the documentation is sparse and it was difficult to understand what its feature set really is. The free plan lets you have 1 user with up to 25 devices. Paid plans are also available.

Zerotier conclusions Frankly I don t see much reason to use Zerotier. The virtual Ethernet model seems to be a weird hybrid that doesn t bring much value. I m concerned about the implications of a compromise of a user account or the control plane, and it lacks a lot of Tailscale features (MagicDNS and sharing). The only thing it may offer in particular is multipath WAN, but that s esoteric enough and also solvable at other layers that it doesn t seem all that compelling to me. Add to that the strange license and, to me anyhow, I don t see much reason to bother with it.

Netmaker Netmaker is one of the projects that is making noise these days. Netmaker is the only one here that is a wrapper around in-kernel Wireguard, which can make a performance difference when talking to peers on a 1Gbps or faster link. Also, unlike other tools, it has an ingress gateway feature that lets people that don t have the Netmaker client, but do have Wireguard, participate in the VPN. I believe I also saw a reference somewhere to nodes as routers as with Yggdrasil, but I m failing to dig it up now. The project is in a bit of an early state; you can sign up for an upcoming closed beta with a SaaS host, but really you are generally pointed to self-hosting using the code in the github repo. There are community and enterprise editions, but it s not clear how to actually choose. The server has a bunch of components: binary, CoreDNS, database, and web server. It also requires elevated privileges on the host, in addition to a container engine. Contrast that to the single binary that some others provide. It looks like releases are frequent, but sometimes break things, and have a somewhat more laborious upgrade processes than most. I don t want to spend a lot of time managing my mesh. So because of the heavy needs of the server, the upgrades being labor-intensive, it taking over iptables and such on the server, I didn t proceed with a more in-depth evaluation of Netmaker. It has a lot of promise, but for me, it doesn t seem to be in a state that will meet my needs yet.

Nebula Nebula is an interesting mesh project that originated within Slack, seems to still be primarily sponsored by Slack, but is also being developed by Defined Networking (though their product looks early right now). Unlike the other tools in this section, Nebula doesn t have a web interface at all. Defined Networking looks likely to provide something of a SaaS service, but for now, you will need to run a broker ( lighthouse ) yourself; perhaps on a $5/mo VPS. Due to the poor firewall traversal properties, I didn t do a full evaluation of Nebula, but it still has a very interesting design.

Nebula: Security and Privacy Since Nebula lacks a traditional control plane, the root of trust in Nebula is a CA (certificate authority). The documentation gives this example of setting it up:
./nebula-cert sign -name "lighthouse1" -ip "192.168.100.1/24"
./nebula-cert sign -name "laptop" -ip "192.168.100.2/24" -groups "laptop,home,ssh"
./nebula-cert sign -name "server1" -ip "192.168.100.9/24" -groups "servers"
./nebula-cert sign -name "host3" -ip "192.168.100.10/24"
So the cert contains your IP, hostname, and group allocation. Each host in the mesh gets your CA certificate, and the per-host cert and key generated from each of these steps. This leads to a really nice security model. Your CA is the gatekeeper to what is trusted in your mesh. You can even have it airgapped or something to make it exceptionally difficult to breach the perimeter. Nebula contains an integrated firewall. Because the ability to keep out unwanted nodes is so strong, I would say this may be the one mesh VPN you might consider using without bothering with an additional on-host firewall. You can define static mappings from a Nebula mesh IP to a clearnet IP. I haven t found information on this, but theoretically if NAT traversal isn t required, these static mappings may allow Nebula nodes to reach each other even if Internet is down. I don t know if this is truly the case, however.

Nebula: Connectivity and NAT traversal This is a weak point of Nebula. Nebula sends all traffic over a single UDP port; there is no provision for using TCP. This is an issue at certain hotel and other public networks which open only TCP egress ports 80 and 443. I couldn t find a lot of detail on what Nebula s NAT traversal is capable of, but according to a certain Github issue, this has been a sore spot for years and isn t as capable as Tailscale. You can designate nodes in Nebula as brokers (relays). The concept is the same as Yggdrasil, but it s less versatile. You have to manually designate what relay to use. It s unclear to me what happens if different nodes designate different relays. Keep in mind that this always happens over a UDP port.

Nebula: Sharing with friends There is no particular support here.

Nebula: DNS Nebula has experimental DNS support. In contrast with Tailscale, which has an internal DNS server on every node, Nebula only runs a DNS server on a lighthouse. This means that it can t forward requests to a DNS server that s upstream for your laptop s particular current location. Actually, Nebula s DNS server doesn t forward at all. It also doesn t resolve its own name. The Nebula documentation makes reference to using multiple lighthouses, which you may want to do for DNS redundancy or performance, but it s unclear to me if this would make each lighthouse form a complete picture of the network.

Nebula: Source code, pricing, and portability Nebula is fully open source (MIT). It consists of a single Go binary and configuration. It is fairly portable.

Nebula conclusions I am attracted to Nebula s unique security model. I would probably be more seriously considering it if not for the lack of support for TCP and poor general NAT traversal properties. Its datacenter connectivity heritage does show through.

Roll your own and hybrid Here is a grab bag of ideas:

Running Yggdrasil over Tailscale One possibility would be to use Tailscale for its superior NAT traversal, then allow Yggdrasil to run over it. (You will need a firewall to prevent Tailscale from trying to run over Yggdrasil at the same time!) This creates a closed network with all the benefits of Yggdrasil, yet getting the NAT traversal from Tailscale. Drawbacks might be the overhead of the double encryption and double encapsulation. A good Yggdrasil peer may wind up being faster than this anyhow.

Public VPN provider for NAT traversal A public VPN provider such as Mullvad will often offer incoming port forwarding and nodes in many cities. This could be an attractive way to solve a bunch of NAT traversal problems: just use one of those services to get you an incoming port, and run whatever you like over that. Be aware that a number of public VPN clients have a kill switch to prevent any traffic from egressing without using the VPN; see, for instance, Mullvad s. You ll need to disable this if you are running a mesh atop it.

Other

Combining with local firewalls For most of these tools, I recommend using a local firewal in conjunction with them. I have been using firehol and find it to be quite nice. This means you don t have to trust the mesh, the control plane, or whatever. The catch is that you do need your mesh VPN to provide strong association between IP address and node. Most, but not all, do.

Performance I tested some of these for performance using iperf3 on a 2.5Gbps LAN. Here are the results. All speeds are in Mbps.
Tool iperf3 (default) iperf3 -P 10 iperf3 -R
Direct (no VPN) 2406 2406 2764
Wireguard (kernel) 1515 1566 2027
Yggdrasil 892 1126 1105
Tailscale 950 1034 1085
Tinc 296 300 277
You can see that Wireguard was significantly faster than the other options. Tailscale and Yggdrasil were roughly comparable, and Tinc was terrible.

IP collisions When you are communicating over a network such as these, you need to trust that the IP address you are communicating with belongs to the system you think it does. This protects against two malicious actor scenarios:
  1. Someone compromises one machine on your mesh and reconfigures it to impersonate a more important one
  2. Someone connects an unauthorized system to the mesh, taking over a trusted IP, and uses the privileges of the trusted IP to access resources
To summarize the state of play as highlighted in the reviews above:
  • Yggdrasil derives IPv6 addresses from a public key
  • tinc allows any node to set any IP
  • Tailscale IPs aren t user-assignable, but the assignment algorithm is unknown
  • Zerotier allows any IP to be allocated to any node at the control plane
  • I don t know what Netmaker does
  • Nebula IPs are baked into the cert and signed by the CA, but I haven t verified the enforcement algorithm
So this discussion really only applies to Yggdrasil and Tailscale. tinc and Zerotier lack detailed IP security, while Nebula expects IP allocations to be handled outside of the tool and baked into the certs (therefore enforcing rigidity at that level). So the question for Yggdrasil and Tailscale is: how easy is it to commandeer a trusted IP? Yggdrasil has a brief discussion of this. In short, Yggdrasil offers you both a dedicated IP and a rarely-used /64 prefix which you can delegate to other machines on your LAN. Obviously by taking the dedicated IP, a lot more bits are available for the hash of the node s public key, making collisions technically impractical, if not outright impossible. However, if you use the /64 prefix, a collision may be more possible. Yggdrasil s hashing algorithm includes some optimizations to make this more difficult. Yggdrasil includes a genkeys tool that uses more CPU cycles to generate keys that are maximally difficult to collide with. Tailscale doesn t document their IP assignment algorithm, but I think it is safe to say that the larger subnet you use, the better. If you try to use a /24 for your mesh, it is certainly conceivable that an attacker could remove your trusted node, then just manually add the 240 or so machines it would take to get that IP reassigned. It might be a good idea to use a purely IPv6 mesh with Tailscale to minimize this problem as well. So, I think the risk is low in the default configurations of both Yggdrasil and Tailscale (certainly lower than with tinc or Zerotier). You can drive the risk even lower with both.

Final thoughts For my own purposes, I suspect I will remain with Yggdrasil in some fashion. Maybe I will just take the small performance hit that using a relay node implies. Or perhaps I will get clever and use an incoming VPN port forward or go over Tailscale. Tailscale was the other option that seemed most interesting. However, living in a region with Internet that goes down more often than I d like, I would like to just be able to send as much traffic over a mesh as possible, trusting that if the LAN is up, the mesh is up. I have one thing that really benefits from performance in excess of Yggdrasil or Tailscale: NFS. That s between two machines that never leave my LAN, so I will probably just set up a direct Wireguard link between them. Heck of a lot easier than trying to do Kerberos! Finally, I wrote this intending to be useful. I dealt with a lot of complexity and under-documentation, so it s possible I got something wrong somewhere. Please let me know if you find any errors.
This blog post is a copy of a page on my website. That page may be periodically updated.

6 April 2023

Michael Ablassmeier: tracking changes between pypi package releases

I wondered if there is some tracking for differences between packages published on pypi, something that stores this information in a format similar to debdiff.. I failed to find something on the web, so created a little utility which watches the pypi changelog for new releaes and fetches the new and old version. It uses diffoscope to create reports on the published releases and automatically pushes them to a github repository: https://github.com/pypi-diff Is it useful? I dont know, it may be handy for code review or maybe running different security scanners on it, to identify accidentaly pushed keys or other sensitive data. Currently its pushing the changes for every released package every 10 minutes, lets see how far this can go just for fun :-)

26 March 2023

Dirk Eddelbuettel: littler 0.3.18 on CRAN: New and Updated Scripts

max-heap image The nineteenth release of littler as a CRAN package landed this morning, following in the now seventeen year history (!!) as a package started by Jeff in 2006, and joined by me a few weeks later. littler is the first command-line interface for R as it predates Rscript. It allows for piping as well for shebang scripting via #!, uses command-line arguments more consistently and still starts faster. It also always loaded the methods package which Rscript only began to do in recent years. littler lives on Linux and Unix, has its difficulties on macOS due to yet-another-braindeadedness there (who ever thought case-insensitive filesystems as a default were a good idea?) and simply does not exist on Windows (yet the build system could be extended see RInside for an existence proof, and volunteers are welcome!). See the FAQ vignette on how to add it to your PATH. A few examples are highlighted at the Github repo, as well as in the examples vignette. This release is somewhat unique is not having any maintenance of the small core but rather focussing exclusively on changes to the (many !!) scripts collected in examples/scripts/. We added new ones and extended several existing ones including install2.r thanks to patches by @eitsupi. Of the new scripts, I already use tttl.r quite a bit for tests on co-maintained packages that use testthat (but as I have made known before my heart belongs firmly to tinytest as nobody should need thirty-plus recursive dependencies to run a few test predicates) as well as the very new installRub.r to directly add r-universe Ubuntu binaries onto an r2u systems as demonstrated in a first and second tweet (and corresponding toot) this week. The full change description follows.

Changes in littler version 0.3.18 (2023-03-25)
  • Changes in examples scripts
    • roxy.r can now set an additional --libpath
    • getRStudioDesktop.r and getRStudioServer.r have updated default download file
    • install2.r and installGithub.r can set --type
    • r2u.r now has a --suffix option
    • tt.r removes a redundant library call
    • tttl.r has been added for testthat::test_local()
    • installRub.r has been added to install r-universe binaries on Ubuntu
    • install2.r has updated error capture messages (Tatsuya Shima and Dirk in #104)

My CRANberries service provides a comparison to the previous release. Full details for the littler release are provided as usual at the ChangeLog page, and also on the package docs website. The code is available via the GitHub repo, from tarballs and now of course also from its CRAN page and via install.packages("littler"). Binary packages are available directly in Debian (currently in experimental due to a release freeze) as well as (in a day or two) Ubuntu binaries at CRAN thanks to the tireless Michael Rutter and of course via r2u following its next refresh. Comments and suggestions are welcome at the GitHub repo. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

13 March 2023

Antoine Beaupr : Framework 12th gen laptop review

The Framework is a 13.5" laptop body with swappable parts, which makes it somewhat future-proof and certainly easily repairable, scoring an "exceedingly rare" 10/10 score from ifixit.com. There are two generations of the laptop's main board (both compatible with the same body): the Intel 11th and 12th gen chipsets. I have received my Framework, 12th generation "DIY", device in late September 2022 and will update this page as I go along in the process of ordering, burning-in, setting up and using the device over the years. Overall, the Framework is a good laptop. I like the keyboard, the touch pad, the expansion cards. Clearly there's been some good work done on industrial design, and it's the most repairable laptop I've had in years. Time will tell, but it looks sturdy enough to survive me many years as well. This is also one of the most powerful devices I ever lay my hands on. I have managed, remotely, more powerful servers, but this is the fastest computer I have ever owned, and it fits in this tiny case. It is an amazing machine. On the downside, there's a bit of proprietary firmware required (WiFi, Bluetooth, some graphics) and the Framework ships with a proprietary BIOS, with currently no Coreboot support. Expect to need the latest kernel, firmware, and hacking around a bunch of things to get resolution and keybindings working right. Like others, I have first found significant power management issues, but many issues can actually be solved with some configuration. Some of the expansion ports (HDMI, DP, MicroSD, and SSD) use power when idle, so don't expect week-long suspend, or "full day" battery while those are plugged in. Finally, the expansion ports are nice, but there's only four of them. If you plan to have a two-monitor setup, you're likely going to need a dock. Read on for the detailed review. For context, I'm moving from the Purism Librem 13v4 because it basically exploded on me. I had, in the meantime, reverted back to an old ThinkPad X220, so I sometimes compare the Framework with that venerable laptop as well. This blog post has been maturing for months now. It started in September 2022 and I declared it completed in March 2023. It's the longest single article on this entire website, currently clocking at about 13,000 words. It will take an average reader a full hour to go through this thing, so I don't expect anyone to actually do that. This introduction should be good enough for most people, read the first section if you intend to actually buy a Framework. Jump around the table of contents as you see fit for after you did buy the laptop, as it might include some crucial hints on how to make it work best for you, especially on (Debian) Linux.

Advice for buyers Those are things I wish I would have known before buying:
  1. consider buying 4 USB-C expansion cards, or at least a mix of 4 USB-A or USB-C cards, as they use less power than other cards and you do want to fill those expansion slots otherwise they snag around and feel insecure
  2. you will likely need a dock or at least a USB hub if you want a two-monitor setup, otherwise you'll run out of ports
  3. you have to do some serious tuning to get proper (10h+ idle, 10 days suspend) power savings
  4. in particular, beware that the HDMI, DisplayPort and particularly the SSD and MicroSD cards take a significant amount power, even when sleeping, up to 2-6W for the latter two
  5. beware that the MicroSD card is what it says: Micro, normal SD cards won't fit, and while there might be full sized one eventually, it's currently only at the prototyping stage
  6. the Framework monitor has an unusual aspect ratio (3:2): I like it (and it matches classic and digital photography aspect ratio), but it might surprise you

Current status I have the framework! It's setup with a fresh new Debian bookworm installation. I've ran through a large number of tests and burn in. I have decided to use the Framework as my daily driver, and had to buy a USB-C dock to get my two monitors connected, which was own adventure. Update: Framework just (2023-03-23) just announced a whole bunch of new stuff: The recording is available in this video and it's not your typical keynote. It starts ~25 minutes late, audio is crap, lightning and camera are crap, clapping seems to be from whatever staff they managed to get together in a room, decor is bizarre, colors are shit. It's amazing.

Specifications Those are the specifications of the 12th gen, in general terms. Your build will of course vary according to your needs.
  • CPU: i5-1240P, i7-1260P, or i7-1280P (Up to 4.4-4.8 GHz, 4+8 cores), Iris Xe graphics
  • Storage: 250-4000GB NVMe (or bring your own)
  • Memory: 8-64GB DDR4-3200 (or bring your own)
  • WiFi 6e (AX210, vPro optional, or bring your own)
  • 296.63mm X 228.98mm X 15.85mm, 1.3Kg
  • 13.5" display, 3:2 ratio, 2256px X 1504px, 100% sRGB, >400 nit
  • 4 x USB-C user-selectable expansion ports, including
    • USB-C
    • USB-A
    • HDMI
    • DP
    • Ethernet
    • MicroSD
    • 250-1000GB SSD
  • 3.5mm combo headphone jack
  • Kill switches for microphone and camera
  • Battery: 55Wh
  • Camera: 1080p 60fps
  • Biometrics: Fingerprint Reader
  • Backlit keyboard
  • Power Adapter: 60W USB-C (or bring your own)
  • ships with a screwdriver/spludger
  • 1 year warranty
  • base price: 1000$CAD, but doesn't give you much, typical builds around 1500-2000$CAD

Actual build This is the actual build I ordered. Amounts in CAD. (1CAD = ~0.75EUR/USD.)

Base configuration
  • CPU: Intel Core i5-1240P (AKA Alder Lake P 8 4.4GHz P-threads, 8 3.2GHz E-threads, 16 total, 28-64W), 1079$
  • Memory: 16GB (1 x 16GB) DDR4-3200, 104$

Customization
  • Keyboard: US English, included

Expansion Cards
  • 2 USB-C $24
  • 3 USB-A $36
  • 2 HDMI $50
  • 1 DP $50
  • 1 MicroSD $25
  • 1 Storage 1TB $199
  • Sub-total: 384$

Accessories
  • Power Adapter - US/Canada $64.00

Total
  • Before tax: 1606$
  • After tax and duties: 1847$
  • Free shipping

Quick evaluation This is basically the TL;DR: here, just focusing on broad pros/cons of the laptop.

Pros

Cons
  • the 11th gen is out of stock, except for the higher-end CPUs, which are much less affordable (700$+)
  • the 12th gen has compatibility issues with Debian, followup in the DebianOn page, but basically: brightness hotkeys, power management, wifi, the webcam is okay even though the chipset is the infamous alder lake because it does not have the fancy camera; most issues currently seem solvable, and upstream is working with mainline to get their shit working
  • 12th gen might have issues with thunderbolt docks
  • they used to have some difficulty keeping up with the orders: first two batches shipped, third batch sold out, fourth batch should have shipped (?) in October 2021. they generally seem to keep up with shipping. update (august 2022): they rolled out a second line of laptops (12th gen), first batch shipped, second batch shipped late, September 2022 batch was generally on time, see this spreadsheet for a crowdsourced effort to track those supply chain issues seem to be under control as of early 2023. I got the Ethernet expansion card shipped within a week.
  • compared to my previous laptop (Purism Librem 13v4), it feels strangely bulkier and heavier; it's actually lighter than the purism (1.3kg vs 1.4kg) and thinner (15.85mm vs 18mm) but the design of the Purism laptop (tapered edges) makes it feel thinner
  • no space for a 2.5" drive
  • rather bright LED around power button, but can be dimmed in the BIOS (not low enough to my taste) I got used to it
  • fan quiet when idle, but can be noisy when running, for example if you max a CPU for a while
  • battery described as "mediocre" by Ars Technica (above), confirmed poor in my tests (see below)
  • no RJ-45 port, and attempts at designing ones are failing because the modular plugs are too thin to fit (according to Linux After Dark), so unlikely to have one in the future Update: they cracked that nut and ship an 2.5 gbps Ethernet expansion card with a realtek chipset, without any firmware blob (!)
  • a bit pricey for the performance, especially when compared to the competition (e.g. Dell XPS, Apple M1)
  • 12th gen Intel has glitchy graphics, seems like Intel hasn't fully landed proper Linux support for that chipset yet

Initial hardware setup A breeze.

Accessing the board The internals are accessed through five TorX screws, but there's a nice screwdriver/spudger that works well enough. The screws actually hold in place so you can't even lose them. The first setup is a bit counter-intuitive coming from the Librem laptop, as I expected the back cover to lift and give me access to the internals. But instead the screws is release the keyboard and touch pad assembly, so you actually need to flip the laptop back upright and lift the assembly off (!) to get access to the internals. Kind of scary. I also actually unplugged a connector in lifting the assembly because I lifted it towards the monitor, while you actually need to lift it to the right. Thankfully, the connector didn't break, it just snapped off and I could plug it back in, no harm done. Once there, everything is well indicated, with QR codes all over the place supposedly leading to online instructions.

Bad QR codes Unfortunately, the QR codes I tested (in the expansion card slot, the memory slot and CPU slots) did not actually work so I wonder how useful those actually are. After all, they need to point to something and that means a URL, a running website that will answer those requests forever. I bet those will break sooner than later and in fact, as far as I can tell, they just don't work at all. I prefer the approach taken by the MNT reform here which designed (with the 100 rabbits folks) an actual paper handbook (PDF). The first QR code that's immediately visible from the back of the laptop, in an expansion cord slot, is a 404. It seems to be some serial number URL, but I can't actually tell because, well, the page is a 404. I was expecting that bar code to lead me to an introduction page, something like "how to setup your Framework laptop". Support actually confirmed that it should point a quickstart guide. But in a bizarre twist, they somehow sent me the URL with the plus (+) signs escaped, like this:
https://guides.frame.work/Guide/Framework\+Laptop\+DIY\+Edition\+Quick\+Start\+Guide/57
... which Firefox immediately transforms in:
https://guides.frame.work/Guide/Framework/+Laptop/+DIY/+Edition/+Quick/+Start/+Guide/57
I'm puzzled as to why they would send the URL that way, the proper URL is of course:
https://guides.frame.work/Guide/Framework+Laptop+DIY+Edition+Quick+Start+Guide/57
(They have also "let the team know about this for feedback and help resolve the problem with the link" which is a support code word for "ha-ha! nope! not my problem right now!" Trust me, I know, my own code word is "can you please make a ticket?")

Seating disks and memory The "DIY" kit doesn't actually have that much of a setup. If you bought RAM, it's shipped outside the laptop in a little plastic case, so you just seat it in as usual. Then you insert your NVMe drive, and, if that's your fancy, you also install your own mPCI WiFi card. If you ordered one (which was my case), it's pre-installed. Closing the laptop is also kind of amazing, because the keyboard assembly snaps into place with magnets. I have actually used the laptop with the keyboard unscrewed as I was putting the drives in and out, and it actually works fine (and will probably void your warranty, so don't do that). (But you can.) (But don't, really.)

Hardware review

Keyboard and touch pad The keyboard feels nice, for a laptop. I'm used to mechanical keyboard and I'm rather violent with those poor things. Yet the key travel is nice and it's clickety enough that I don't feel too disoriented. At first, I felt the keyboard as being more laggy than my normal workstation setup, but it turned out this was a graphics driver issues. After enabling a composition manager, everything feels snappy. The touch pad feels good. The double-finger scroll works well enough, and I don't have to wonder too much where the middle button is, it just works. Taps don't work, out of the box: that needs to be enabled in Xorg, with something like this:
cat > /etc/X11/xorg.conf.d/40-libinput.conf <<EOF
Section "InputClass"
      Identifier "libinput touch pad catchall"
      MatchIsTouchpad "on"
      MatchDevicePath "/dev/input/event*"
      Driver "libinput"
      Option "Tapping" "on"
      Option "TappingButtonMap" "lmr"
EndSection
EOF
But be aware that once you enable that tapping, you'll need to deal with palm detection... So I have not actually enabled this in the end.

Power button The power button is a little dangerous. It's quite easy to hit, as it's right next to one expansion card where you are likely to plug in a cable power. And because the expansion cards are kind of hard to remove, you might squeeze the laptop (and the power key) when trying to remove the expansion card next to the power button. So obviously, don't do that. But that's not very helpful. An alternative is to make the power button do something else. With systemd-managed systems, it's actually quite easy. Add a HandlePowerKey stanza to (say) /etc/systemd/logind.conf.d/power-suspends.conf:
[Login]
HandlePowerKey=suspend
HandlePowerKeyLongPress=poweroff
You might have to create the directory first:
mkdir /etc/systemd/logind.conf.d/
Then restart logind:
systemctl restart systemd-logind
And the power button will suspend! Long-press to power off doesn't actually work as the laptop immediately suspends... Note that there's probably half a dozen other ways of doing this, see this, this, or that.

Special keybindings There is a series of "hidden" (as in: not labeled on the key) keybindings related to the fn keybinding that I actually find quite useful.
Key Equivalent Effect Command
p Pause lock screen xset s activate
b Break ? ?
k ScrLk switch keyboard layout N/A
It looks like those are defined in the microcontroller so it would be possible to add some. For example, the SysRq key is almost bound to fn s in there. Note that most other shortcuts like this are clearly documented (volume, brightness, etc). One key that's less obvious is F12 that only has the Framework logo on it. That actually calls the keysym XF86AudioMedia which, interestingly, does absolutely nothing here. By default, on Windows, it opens your browser to the Framework website and, on Linux, your "default media player". The keyboard backlight can be cycled with fn-space. The dimmer version is dim enough, and the keybinding is easy to find in the dark. A skinny elephant would be performed with alt PrtScr (above F11) KEY, so for example alt fn F11 b should do a hard reset. This comment suggests you need to hold the fn only if "function lock" is on, but that's actually the opposite of my experience. Out of the box, some of the fn keys don't work. Mute, volume up/down, brightness, monitor changes, and the airplane mode key all do basically nothing. They don't send proper keysyms to Xorg at all. This is a known problem and it's related to the fact that the laptop has light sensors to adjust the brightness automatically. Somehow some of those keys (e.g. the brightness controls) are supposed to show up as a different input device, but don't seem to work correctly. It seems like the solution is for the Framework team to write a driver specifically for this, but so far no progress since July 2022. In the meantime, the fancy functionality can be supposedly disabled with:
echo 'blacklist hid_sensor_hub'   sudo tee /etc/modprobe.d/framework-als-blacklist.conf
... and a reboot. This solution is also documented in the upstream guide. Note that there's another solution flying around that fixes this by changing permissions on the input device but I haven't tested that or seen confirmation it works.

Kill switches The Framework has two "kill switches": one for the camera and the other for the microphone. The camera one actually disconnects the USB device when turned off, and the mic one seems to cut the circuit. It doesn't show up as muted, it just stops feeding the sound. Both kill switches are around the main camera, on top of the monitor, and quite discreet. Then turn "red" when enabled (i.e. "red" means "turned off").

Monitor The monitor looks pretty good to my untrained eyes. I have yet to do photography work on it, but some photos I looked at look sharp and the colors are bright and lively. The blacks are dark and the screen is bright. I have yet to use it in full sunlight. The dimmed light is very dim, which I like.

Screen backlight I bind brightness keys to xbacklight in i3, but out of the box I get this error:
sep 29 22:09:14 angela i3[5661]: No outputs have backlight property
It just requires this blob in /etc/X11/xorg.conf.d/backlight.conf:
Section "Device"
    Identifier  "Card0"
    Driver      "intel"
    Option      "Backlight"  "intel_backlight"
EndSection
This way I can control the actual backlight power with the brightness keys, and they do significantly reduce power usage.

Multiple monitor support I have been able to hook up my two old monitors to the HDMI and DisplayPort expansion cards on the laptop. The lid closes without suspending the machine, and everything works great. I actually run out of ports, even with a 4-port USB-A hub, which gives me a total of 7 ports:
  1. power (USB-C)
  2. monitor 1 (DisplayPort)
  3. monitor 2 (HDMI)
  4. USB-A hub, which adds:
  5. keyboard (USB-A)
  6. mouse (USB-A)
  7. Yubikey
  8. external sound card
Now the latter, I might be able to get rid of if I switch to a combo-jack headset, which I do have (and still need to test). But still, this is a problem. I'll probably need a powered USB-C dock and better monitors, possibly with some Thunderbolt chaining, to save yet more ports. But that means more money into this setup, argh. And figuring out my monitor situation is the kind of thing I'm not that big of a fan of. And neither is shopping for USB-C (or is it Thunderbolt?) hubs. My normal autorandr setup doesn't work: I have tried saving a profile and it doesn't get autodetected, so I also first need to do:
autorandr -l framework-external-dual-lg-acer
The magic:
autorandr -l horizontal
... also works well. The worst problem with those monitors right now is that they have a radically smaller resolution than the main screen on the laptop, which means I need to reset the font scaling to normal every time I switch back and forth between those monitors and the laptop, which means I actually need to do this:
autorandr -l horizontal &&
eho Xft.dpi: 96   xrdb -merge &&
systemctl restart terminal xcolortaillog background-image emacs &&
i3-msg restart
Kind of disruptive.

Expansion ports I ordered a total of 10 expansion ports. I did manage to initialize the 1TB drive as an encrypted storage, mostly to keep photos as this is something that takes a massive amount of space (500GB and counting) and that I (unfortunately) don't work on very often (but still carry around). The expansion ports are fancy and nice, but not actually that convenient. They're a bit hard to take out: you really need to crimp your fingernails on there and pull hard to take them out. There's a little button next to them to release, I think, but at first it feels a little scary to pull those pucks out of there. You get used to it though, and it's one of those things you can do without looking eventually. There's only four expansion ports. Once you have two monitors, the drive, and power plugged in, bam, you're out of ports; there's nowhere to plug my Yubikey. So if this is going to be my daily driver, with a dual monitor setup, I will need a dock, which means more crap firmware and uncertainty, which isn't great. There are actually plans to make a dual-USB card, but that is blocked on designing an actual board for this. I can't wait to see more expansion ports produced. There's a ethernet expansion card which quickly went out of stock basically the day it was announced, but was eventually restocked. I would like to see a proper SD-card reader. There's a MicroSD card reader, but that obviously doesn't work for normal SD cards, which would be more broadly compatible anyways (because you can have a MicroSD to SD card adapter, but I have never heard of the reverse). Someone actually found a SD card reader that fits and then someone else managed to cram it in a 3D printed case, which is kind of amazing. Still, I really like that idea that I can carry all those little adapters in a pouch when I travel and can basically do anything I want. It does mean I need to shuffle through them to find the right one which is a little annoying. I have an elastic band to keep them lined up so that all the ports show the same side, to make it easier to find the right one. But that quickly gets undone and instead I have a pouch full of expansion cards. Another awesome thing with the expansion cards is that they don't just work on the laptop: anything that takes USB-C can take those cards, which means you can use it to connect an SD card to your phone, for backups, for example. Heck, you could even connect an external display to your phone that way, assuming that's supported by your phone of course (and it probably isn't). The expansion ports do take up some power, even when idle. See the power management section below, and particularly the power usage tests for details.

USB-C charging One thing that is really a game changer for me is USB-C charging. It's hard to overstate how convenient this is. I often have a USB-C cable lying around to charge my phone, and I can just grab that thing and pop it in my laptop. And while it will obviously not charge as fast as the provided charger, it will stop draining the battery at least. (As I wrote this, I had the laptop plugged in the Samsung charger that came with a phone, and it was telling me it would take 6 hours to charge the remaining 15%. With the provided charger, that flew down to 15 minutes. Similarly, I can power the laptop from the power grommet on my desk, reducing clutter as I have that single wire out there instead of the bulky power adapter.) I also really like the idea that I can charge my laptop with a power bank or, heck, with my phone, if push comes to shove. (And vice-versa!) This is awesome. And it works from any of the expansion ports, of course. There's a little led next to the expansion ports as well, which indicate the charge status:
  • red/amber: charging
  • white: charged
  • off: unplugged
I couldn't find documentation about this, but the forum answered. This is something of a recurring theme with the Framework. While it has a good knowledge base and repair/setup guides (and the forum is awesome) but it doesn't have a good "owner manual" that shows you the different parts of the laptop and what they do. Again, something the MNT reform did well. Another thing that people are asking about is an external sleep indicator: because the power LED is on the main keyboard assembly, you don't actually see whether the device is active or not when the lid is closed. Finally, I wondered what happens when you plug in multiple power sources and it turns out the charge controller is actually pretty smart: it will pick the best power source and use it. The only downside is it can't use multiple power sources, but that seems like a bit much to ask.

Multimedia and other devices Those things also work:
  • webcam: splendid, best webcam I've ever had (but my standards are really low)
  • onboard mic: works well, good gain (maybe a bit much)
  • onboard speakers: sound okay, a little metal-ish, loud enough to be annoying, see this thread for benchmarks, apparently pretty good speakers
  • combo jack: works, with slight hiss, see below
There's also a light sensor, but it conflicts with the keyboard brightness controls (see above). There's also an accelerometer, but it's off by default and will be removed from future builds.

Combo jack mic tests The Framework laptop ships with a combo jack on the left side, which allows you to plug in a CTIA (source) headset. In human terms, it's a device that has both a stereo output and a mono input, typically a headset or ear buds with a microphone somewhere. It works, which is better than the Purism (which only had audio out), but is on par for the course for that kind of onboard hardware. Because of electrical interference, such sound cards very often get lots of noise from the board. With a Jabra Evolve 40, the built-in USB sound card generates basically zero noise on silence (invisible down to -60dB in Audacity) while plugging it in directly generates a solid -30dB hiss. There is a noise-reduction system in that sound card, but the difference is still quite striking. On a comparable setup (curie, a 2017 Intel NUC), there is also a his with the Jabra headset, but it's quieter, more in the order of -40/-50 dB, a noticeable difference. Interestingly, testing with my Mee Audio Pro M6 earbuds leads to a little more hiss on curie, more on the -35/-40 dB range, close to the Framework. Also note that another sound card, the Antlion USB adapter that comes with the ModMic 4, also gives me pretty close to silence on a quiet recording, picking up less than -50dB of background noise. It's actually probably picking up the fans in the office, which do make audible noises. In other words, the hiss of the sound card built in the Framework laptop is so loud that it makes more noise than the quiet fans in the office. Or, another way to put it is that two USB sound cards (the Jabra and the Antlion) are able to pick up ambient noise in my office but not the Framework laptop. See also my audio page.

Performance tests

Compiling Linux 5.19.11 On a single core, compiling the Debian version of the Linux kernel takes around 100 minutes:
5411.85user 673.33system 1:37:46elapsed 103%CPU (0avgtext+0avgdata 831700maxresident)k
10594704inputs+87448000outputs (9131major+410636783minor)pagefaults 0swaps
This was using 16 watts of power, with full screen brightness. With all 16 cores (make -j16), it takes less than 25 minutes:
19251.06user 2467.47system 24:13.07elapsed 1494%CPU (0avgtext+0avgdata 831676maxresident)k
8321856inputs+87427848outputs (30792major+409145263minor)pagefaults 0swaps
I had to plug the normal power supply after a few minutes because battery would actually run out using my desk's power grommet (34 watts). During compilation, fans were spinning really hard, quite noisy, but not painfully so. The laptop was sucking 55 watts of power, steadily:
  Time    User  Nice   Sys  Idle    IO  Run Ctxt/s  IRQ/s Fork Exec Exit  Watts
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
 Average  87.9   0.0  10.7   1.4   0.1 17.8 6583.6 5054.3 233.0 223.9 233.1  55.96
 GeoMean  87.9   0.0  10.6   1.2   0.0 17.6 6427.8 5048.1 227.6 218.7 227.7  55.96
  StdDev   1.4   0.0   1.2   0.6   0.2  3.0 1436.8  255.5 50.0 47.5 49.7   0.20
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
 Minimum  85.0   0.0   7.8   0.5   0.0 13.0 3594.0 4638.0 117.0 111.0 120.0  55.52
 Maximum  90.8   0.0  12.9   3.5   0.8 38.0 10174.0 5901.0 374.0 362.0 375.0  56.41
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
Summary:
CPU:  55.96 Watts on average with standard deviation 0.20
Note: power read from RAPL domains: package-0, uncore, package-0, core, psys.
These readings do not cover all the hardware in this device.

memtest86+ I ran Memtest86+ v6.00b3. It shows something like this:
Memtest86+ v6.00b3        12th Gen Intel(R) Core(TM) i5-1240P
CLK/Temp: 2112MHz    78/78 C   Pass  2% #
L1 Cache:   48KB    414 GB/s   Test 46% ##################
L2 Cache: 1.25MB    118 GB/s   Test #3 [Moving inversions, 1s & 0s] 
L3 Cache:   12MB     43 GB/s   Testing: 16GB - 18GB [1GB of 15.7GB]
Memory  :  15.7GB  14.9 GB/s   Pattern: 
--------------------------------------------------------------------------------
CPU: 4P+8E-Cores (16T)    SMP: 8T (PAR))    Time:  0:27:23  Status: Pass     \
RAM: 1600MHz (DDR4-3200) CAS 22-22-22-51    Pass:  1        Errors: 0
--------------------------------------------------------------------------------
Memory SPD Information
----------------------
 - Slot 2: 16GB DDR-4-3200 - Crucial CT16G4SFRA32A.C16FP (2022-W23)
                          Framework FRANMACP04
 <ESC> Exit  <F1> Configuration  <Space> Scroll Lock            6.00.unknown.x64
So about 30 minutes for a full 16GB memory test.

Software setup Once I had everything in the hardware setup, I figured, voil , I'm done, I'm just going to boot this beautiful machine and I can get back to work. I don't understand why I am so na ve some times. It's mind boggling. Obviously, it didn't happen that way at all, and I spent the best of the three following days tinkering with the laptop.

Secure boot and EFI First, I couldn't boot off of the NVMe drive I transferred from the previous laptop (the Purism) and the BIOS was not very helpful: it was just complaining about not finding any boot device, without dropping me in the real BIOS. At first, I thought it was a problem with my NVMe drive, because it's not listed in the compatible SSD drives from upstream. But I figured out how to enter BIOS (press F2 manically, of course), which showed the NVMe drive was actually detected. It just didn't boot, because it was an old (2010!!) Debian install without EFI. So from there, I disabled secure boot, and booted a grml image to try to recover. And by "boot" I mean, I managed to get to the grml boot loader which promptly failed to load its own root file system somehow. I still have to investigate exactly what happened there, but it failed some time after the initrd load with:
Unable to find medium containing a live file system
This, it turns out, was fixed in Debian lately, so a daily GRML build will not have this problems. The upcoming 2022 release (likely 2022.10 or 2022.11) will also get the fix. I did manage to boot the development version of the Debian installer which was a surprisingly good experience: it mounted the encrypted drives and did everything pretty smoothly. It even offered me to reinstall the boot loader, but that ultimately (and correctly, as it turns out) failed because I didn't have a /boot/efi partition. At this point, I realized there was no easy way out of this, and I just proceeded to completely reinstall Debian. I had a spare NVMe drive lying around (backups FTW!) so I just swapped that in, rebooted in the Debian installer, and did a clean install. I wanted to switch to bookworm anyways, so I guess that's done too.

Storage limitations Another thing that happened during setup is that I tried to copy over the internal 2.5" SSD drive from the Purism to the Framework 1TB expansion card. There's no 2.5" slot in the new laptop, so that's pretty much the only option for storage expansion. I was tired and did something wrong. I ended up wiping the partition table on the original 2.5" drive. Oops. It might be recoverable, but just restoring the partition table didn't work either, so I'm not sure how I recover the data there. Normally, everything on my laptops and workstations is designed to be disposable, so that wasn't that big of a problem. I did manage to recover most of the data thanks to git-annex reinit, but that was a little hairy.

Bootstrapping Puppet Once I had some networking, I had to install all the packages I needed. The time I spent setting up my workstations with Puppet has finally paid off. What I actually did was to restore two critical directories:
/etc/ssh
/var/lib/puppet
So that I would keep the previous machine's identity. That way I could contact the Puppet server and install whatever was missing. I used my Puppet optimization trick to do a batch install and then I had a good base setup, although not exactly as it was before. 1700 packages were installed manually on angela before the reinstall, and not in Puppet. I did not inspect each one individually, but I did go through /etc and copied over more SSH keys, for backups and SMTP over SSH.

LVFS support It looks like there's support for the (de-facto) standard LVFS firmware update system. At least I was able to update the UEFI firmware with a simple:
apt install fwupd-amd64-signed
fwupdmgr refresh
fwupdmgr get-updates
fwupdmgr update
Nice. The 12th gen BIOS updates, currently (January 2023) beta, can be deployed through LVFS with:
fwupdmgr enable-remote lvfs-testing
echo 'DisableCapsuleUpdateOnDisk=true' >> /etc/fwupd/uefi_capsule.conf 
fwupdmgr update
Those instructions come from the beta forum post. I performed the BIOS update on 2023-01-16T16:00-0500.

Resolution tweaks The Framework laptop resolution (2256px X 1504px) is big enough to give you a pretty small font size, so welcome to the marvelous world of "scaling". The Debian wiki page has a few tricks for this.

Console This will make the console and grub fonts more readable:
cat >> /etc/default/console-setup <<EOF
FONTFACE="Terminus"
FONTSIZE=32x16
EOF
echo GRUB_GFXMODE=1024x768 >> /etc/default/grub
update-grub

Xorg Adding this to your .Xresources will make everything look much bigger:
! 1.5*96
Xft.dpi: 144
Apparently, some of this can also help:
! These might also be useful depending on your monitor and personal preference:
Xft.autohint: 0
Xft.lcdfilter:  lcddefault
Xft.hintstyle:  hintfull
Xft.hinting: 1
Xft.antialias: 1
Xft.rgba: rgb
It my experience it also makes things look a little fuzzier, which is frustrating because you have this awesome monitor but everything looks out of focus. Just bumping Xft.dpi by a 1.5 factor looks good to me. The Debian Wiki has a page on HiDPI, but it's not as good as the Arch Wiki, where the above blurb comes from. I am not using the latter because I suspect it's causing some of the "fuzziness". TODO: find the equivalent of this GNOME hack in i3? (gsettings set org.gnome.mutter experimental-features "['scale-monitor-framebuffer']"), taken from this Framework guide

Issues

BIOS configuration The Framework BIOS has some minor issues. One issue I personally encountered is that I had disabled Quick boot and Quiet boot in the BIOS to diagnose the above boot issues. This, in turn, triggers a bug where the BIOS boot manager (F12) would just hang completely. It would also fail to boot from an external USB drive. The current fix (as of BIOS 3.03) is to re-enable both Quick boot and Quiet boot. Presumably this is something that will get fixed in a future BIOS update. Note that the following keybindings are active in the BIOS POST check:
Key Meaning
F2 Enter BIOS setup menu
F12 Enter BIOS boot manager
Delete Enter BIOS setup menu

WiFi compatibility issues I couldn't make WiFi work at first. Obviously, the default Debian installer doesn't ship with proprietary firmware (although that might change soon) so the WiFi card didn't work out of the box. But even after copying the firmware through a USB stick, I couldn't quite manage to find the right combination of ip/iw/wpa-supplicant (yes, after repeatedly copying a bunch more packages over to get those bootstrapped). (Next time I should probably try something like this post.) Thankfully, I had a little USB-C dongle with a RJ-45 jack lying around. That also required a firmware blob, but it was a single package to copy over, and with that loaded, I had network. Eventually, I did managed to make WiFi work; the problem was more on the side of "I forgot how to configure a WPA network by hand from the commandline" than anything else. NetworkManager worked fine and got WiFi working correctly. Note that this is with Debian bookworm, which has the 5.19 Linux kernel, and with the firmware-nonfree (firmware-iwlwifi, specifically) package.

Battery life I was having between about 7 hours of battery on the Purism Librem 13v4, and that's after a year or two of battery life. Now, I still have about 7 hours of battery life, which is nicer than my old ThinkPad X220 (20 minutes!) but really, it's not that good for a new generation laptop. The 12th generation Intel chipset probably improved things compared to the previous one Framework laptop, but I don't have a 11th gen Framework to compare with). (Note that those are estimates from my status bar, not wall clock measurements. They should still be comparable between the Purism and Framework, that said.) The battery life doesn't seem up to, say, Dell XPS 13, ThinkPad X1, and of course not the Apple M1, where I would expect 10+ hours of battery life out of the box. That said, I do get those kind estimates when the machine is fully charged and idle. In fact, when everything is quiet and nothing is plugged in, I get dozens of hours of battery life estimated (I've seen 25h!). So power usage fluctuates quite a bit depending on usage, which I guess is expected. Concretely, so far, light web browsing, reading emails and writing notes in Emacs (e.g. this file) takes about 8W of power:
Time    User  Nice   Sys  Idle    IO  Run Ctxt/s  IRQ/s Fork Exec Exit  Watts
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
 Average   1.7   0.0   0.5  97.6   0.2  1.2 4684.9 1985.2 126.6 39.1 128.0   7.57
 GeoMean   1.4   0.0   0.4  97.6   0.1  1.2 4416.6 1734.5 111.6 27.9 113.3   7.54
  StdDev   1.0   0.2   0.2   1.2   0.0  0.5 1584.7 1058.3 82.1 44.0 80.2   0.71
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
 Minimum   0.2   0.0   0.2  94.9   0.1  1.0 2242.0  698.2 82.0 17.0 82.0   6.36
 Maximum   4.1   1.1   1.0  99.4   0.2  3.0 8687.4 4445.1 463.0 249.0 449.0   9.10
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
Summary:
System:   7.57 Watts on average with standard deviation 0.71
Expansion cards matter a lot in the battery life (see below for a thorough discussion), my normal setup is 2xUSB-C and 1xUSB-A (yes, with an empty slot, and yes, to save power). Interestingly, playing a video in a (720p) window in a window takes up more power (10.5W) than in full screen (9.5W) but I blame that on my desktop setup (i3 + compton)... Not sure if mpv hits the VA-API, maybe not in windowed mode. Similar results with 1080p, interestingly, except the window struggles to keep up altogether. Full screen playback takes a relatively comfortable 9.5W, which means a solid 5h+ of playback, which is fine by me. Fooling around the web, small edits, youtube-dl, and I'm at around 80% battery after about an hour, with an estimated 5h left, which is a little disappointing. I had a 7h remaining estimate before I started goofing around Discourse, so I suspect the website is a pretty big battery drain, actually. I see about 10-12 W, while I was probably at half that (6-8W) just playing music with mpv in the background... In other words, it looks like editing posts in Discourse with Firefox takes a solid 4-6W of power. Amazing and gross. (When writing about abusive power usage generates more power usage, is that an heisenbug? Or schr dinbug?)

Power management Compared to the Purism Librem 13v4, the ongoing power usage seems to be slightly better. An anecdotal metric is that the Purism would take 800mA idle, while the more powerful Framework manages a little over 500mA as I'm typing this, fluctuating between 450 and 600mA. That is without any active expansion card, except the storage. Those numbers come from the output of tlp-stat -b and, unfortunately, the "ampere" unit makes it quite hard to compare those, because voltage is not necessarily the same between the two platforms.
  • TODO: review Arch Linux's tips on power saving
  • TODO: i915 driver has a lot of parameters, including some about power saving, see, again, the arch wiki, and particularly enable_fbc=1
TL:DR; power management on the laptop is an issue, but there's various tweaks you can make to improve it. Try:
  • powertop --auto-tune
  • apt install tlp && systemctl enable tlp
  • nvme.noacpi=1 mem_sleep_default=deep on the kernel command line may help with standby power usage
  • keep only USB-C expansion cards plugged in, all others suck power even when idle
  • consider upgrading the BIOS to latest beta (3.06 at the time of writing), unverified power savings
  • latest Linux kernels (6.2) promise power savings as well (unverified)
Update: also try to follow the official optimization guide. It was made for Ubuntu but will probably also work for your distribution of choice with a few tweaks. They recommend using tlpui but it's not packaged in Debian. There is, however, a Flatpak release. In my case, it resulted in the following diff to tlp.conf: tlp.patch.

Background on CPU architecture There were power problems in the 11th gen Framework laptop, according to this report from Linux After Dark, so the issues with power management on the Framework are not new. The 12th generation Intel CPU (AKA "Alder Lake") is a big-little architecture with "power-saving" and "performance" cores. There used to be performance problems introduced by the scheduler in Linux 5.16 but those were eventually fixed in 5.18, which uses Intel's hardware as an "intelligent, low-latency hardware-assisted scheduler". According to Phoronix, the 5.19 release improved the power saving, at the cost of some penalty cost. There were also patch series to make the scheduler configurable, but it doesn't look those have been merged as of 5.19. There was also a session about this at the 2022 Linux Plumbers, but they stopped short of talking more about the specific problems Linux is facing in Alder lake:
Specifically, the kernel's energy-aware scheduling heuristics don't work well on those CPUs. A number of features present there complicate the energy picture; these include SMT, Intel's "turbo boost" mode, and the CPU's internal power-management mechanisms. For many workloads, running on an ostensibly more power-hungry Pcore can be more efficient than using an Ecore. Time for discussion of the problem was lacking, though, and the session came to a close.
All this to say that the 12gen Intel line shipped with this Framework series should have better power management thanks to its power-saving cores. And Linux has had the scheduler changes to make use of this (but maybe is still having trouble). In any case, this might not be the source of power management problems on my laptop, quite the opposite. Also note that the firmware updates for various chipsets are supposed to improve things eventually. On the other hand, The Verge simply declared the whole P-series a mistake...

Attempts at improving power usage I did try to follow some of the tips in this forum post. The tricks powertop --auto-tune and tlp's PCIE_ASPM_ON_BAT=powersupersave basically did nothing: I was stuck at 10W power usage in powertop (600+mA in tlp-stat). Apparently, I should be able to reach the C8 CPU power state (or even C9, C10) in powertop, but I seem to be stock at C7. (Although I'm not sure how to read that tab in powertop: in the Core(HW) column there's only C3/C6/C7 states, and most cores are 85% in C7 or maybe C6. But the next column over does show many CPUs in C10 states... As it turns out, the graphics card actually takes up a good chunk of power unless proper power management is enabled (see below). After tweaking this, I did manage to get down to around 7W power usage in powertop. Expansion cards actually do take up power, and so does the screen, obviously. The fully-lit screen takes a solid 2-3W of power compared to the fully dimmed screen. When removing all expansion cards and making the laptop idle, I can spin it down to 4 watts power usage at the moment, and an amazing 2 watts when the screen turned off.

Caveats Abusive (10W+) power usage that I initially found could be a problem with my desktop configuration: I have this silly status bar that updates every second and probably causes redraws... The CPU certainly doesn't seem to spin down below 1GHz. Also note that this is with an actual desktop running with everything: it could very well be that some things (I'm looking at you Signal Desktop) take up unreasonable amount of power on their own (hello, 1W/electron, sheesh). Syncthing and containerd (Docker!) also seem to take a good 500mW just sitting there. Beyond my desktop configuration, this could, of course, be a Debian-specific problem; your favorite distribution might be better at power management.

Idle power usage tests Some expansion cards waste energy, even when unused. Here is a summary of the findings from the powerstat page. I also include other devices tested in this page for completeness:
Device Minimum Average Max Stdev Note
Screen, 100% 2.4W 2.6W 2.8W N/A
Screen, 1% 30mW 140mW 250mW N/A
Backlight 1 290mW ? ? ? fairly small, all things considered
Backlight 2 890mW 1.2W 3W? 460mW? geometric progression
Backlight 3 1.69W 1.5W 1.8W? 390mW? significant power use
Radios 100mW 250mW N/A N/A
USB-C N/A N/A N/A N/A negligible power drain
USB-A 10mW 10mW ? 10mW almost negligible
DisplayPort 300mW 390mW 600mW N/A not passive
HDMI 380mW 440mW 1W? 20mW not passive
1TB SSD 1.65W 1.79W 2W 12mW significant, probably higher when busy
MicroSD 1.6W 3W 6W 1.93W highest power usage, possibly even higher when busy
Ethernet 1.69W 1.64W 1.76W N/A comparable to the SSD card
So it looks like all expansion cards but the USB-C ones are active, i.e. they draw power with idle. The USB-A cards are the least concern, sucking out 10mW, pretty much within the margin of error. But both the DisplayPort and HDMI do take a few hundred miliwatts. It looks like USB-A connectors have this fundamental flaw that they necessarily draw some powers because they lack the power negotiation features of USB-C. At least according to this post:
It seems the USB A must have power going to it all the time, that the old USB 2 and 3 protocols, the USB C only provides power when there is a connection. Old versus new.
Apparently, this is a problem specific to the USB-C to USB-A adapter that ships with the Framework. Some people have actually changed their orders to all USB-C because of this problem, but I'm not sure the problem is as serious as claimed in the forums. I couldn't reproduce the "one watt" power drains suggested elsewhere, at least not repeatedly. (A previous version of this post did show such a power drain, but it was in a less controlled test environment than the series of more rigorous tests above.) The worst offenders are the storage cards: the SSD drive takes at least one watt of power and the MicroSD card seems to want to take all the way up to 6 watts of power, both just sitting there doing nothing. This confirms claims of 1.4W for the SSD (but not 5W) power usage found elsewhere. The former post has instructions on how to disable the card in software. The MicroSD card has been reported as using 2 watts, but I've seen it as high as 6 watts, which is pretty damning. The Framework team has a beta update for the DisplayPort adapter but currently only for Windows (LVFS technically possible, "under investigation"). A USB-A firmware update is also under investigation. It is therefore likely at least some of those power management issues will eventually be fixed. Note that the upcoming Ethernet card has a reported 2-8W power usage, depending on traffic. I did my own power usage tests in powerstat-wayland and they seem lower than 2W. The upcoming 6.2 Linux kernel might also improve battery usage when idle, see this Phoronix article for details, likely in early 2023.

Idle power usage tests under Wayland Update: I redid those tests under Wayland, see powerstat-wayland for details. The TL;DR: is that power consumption is either smaller or similar.

Idle power usage tests, 3.06 beta BIOS I redid the idle tests after the 3.06 beta BIOS update and ended up with this results:
Device Minimum Average Max Stdev Note
Baseline 1.96W 2.01W 2.11W 30mW 1 USB-C, screen off, backlight off, no radios
2 USB-C 1.95W 2.16W 3.69W 430mW USB-C confirmed as mostly passive...
3 USB-C 1.95W 2.16W 3.69W 430mW ... although with extra stdev
1TB SSD 3.72W 3.85W 4.62W 200mW unchanged from before upgrade
1 USB-A 1.97W 2.18W 4.02W 530mW unchanged
2 USB-A 1.97W 2.00W 2.08W 30mW unchanged
3 USB-A 1.94W 1.99W 2.03W 20mW unchanged
MicroSD w/o card 3.54W 3.58W 3.71W 40mW significant improvement! 2-3W power saving!
MicroSD w/ card 3.53W 3.72W 5.23W 370mW new measurement! increased deviation
DisplayPort 2.28W 2.31W 2.37W 20mW unchanged
1 HDMI 2.43W 2.69W 4.53W 460mW unchanged
2 HDMI 2.53W 2.59W 2.67W 30mW unchanged
External USB 3.85W 3.89W 3.94W 30mW new result
Ethernet 3.60W 3.70W 4.91W 230mW unchanged
Note that the table summary is different than the previous table: here we show the absolute numbers while the previous table was doing a confusing attempt at showing relative (to the baseline) numbers. Conclusion: the 3.06 BIOS update did not significantly change idle power usage stats except for the MicroSD card which has significantly improved. The new "external USB" test is also interesting: it shows how the provided 1TB SSD card performs (admirably) compared to existing devices. The other new result is the MicroSD card with a card which, interestingly, uses less power than the 1TB SSD drive.

Standby battery usage I wrote some quick hack to evaluate how much power is used during sleep. Apparently, this is one of the areas that should have improved since the first Framework model, let's find out. My baseline for comparison is the Purism laptop, which, in 10 minutes, went from this:
sep 28 11:19:45 angela systemd-sleep[209379]: /sys/class/power_supply/BAT/charge_now                      =   6045 [mAh]
... to this:
sep 28 11:29:47 angela systemd-sleep[209725]: /sys/class/power_supply/BAT/charge_now                      =   6037 [mAh]
That's 8mAh per 10 minutes (and 2 seconds), or 48mA, or, with this battery, about 127 hours or roughly 5 days of standby. Not bad! In comparison, here is my really old x220, before:
sep 29 22:13:54 emma systemd-sleep[176315]: /sys/class/power_supply/BAT0/energy_now                     =   5070 [mWh]
... after:
sep 29 22:23:54 emma systemd-sleep[176486]: /sys/class/power_supply/BAT0/energy_now                     =   4980 [mWh]
... which is 90 mwH in 10 minutes, or a whopping 540mA, which was possibly okay when this battery was new (62000 mAh, so about 100 hours, or about 5 days), but this battery is almost dead and has only 5210 mAh when full, so only 10 hours standby. And here is the Framework performing a similar test, before:
sep 29 22:27:04 angela systemd-sleep[4515]: /sys/class/power_supply/BAT1/charge_full                    =   3518 [mAh]
sep 29 22:27:04 angela systemd-sleep[4515]: /sys/class/power_supply/BAT1/charge_now                     =   2861 [mAh]
... after:
sep 29 22:37:08 angela systemd-sleep[4743]: /sys/class/power_supply/BAT1/charge_now                     =   2812 [mAh]
... which is 49mAh in a little over 10 minutes (and 4 seconds), or 292mA, much more than the Purism, but half of the X220. At this rate, the battery would last on standby only 12 hours!! That is pretty bad. Note that this was done with the following expansion cards:
  • 2 USB-C
  • 1 1TB SSD drive
  • 1 USB-A with a hub connected to it, with keyboard and LAN
Preliminary tests without the hub (over one minute) show that it doesn't significantly affect this power consumption (300mA). This guide also suggests booting with nvme.noacpi=1 but this still gives me about 5mAh/min (or 300mA). Adding mem_sleep_default=deep to the kernel command line does make a difference. Before:
sep 29 23:03:11 angela systemd-sleep[3699]: /sys/class/power_supply/BAT1/charge_now                     =   2544 [mAh]
... after:
sep 29 23:04:25 angela systemd-sleep[4039]: /sys/class/power_supply/BAT1/charge_now                     =   2542 [mAh]
... which is 2mAh in 74 seconds, which is 97mA, brings us to a more reasonable 36 hours, or a day and a half. It's still above the x220 power usage, and more than an order of magnitude more than the Purism laptop. It's also far from the 0.4% promised by upstream, which would be 14mA for the 3500mAh battery. It should also be noted that this "deep" sleep mode is a little more disruptive than regular sleep. As you can see by the timing, it took more than 10 seconds for the laptop to resume, which feels a little alarming as your banging the keyboard to bring it back to life. You can confirm the current sleep mode with:
# cat /sys/power/mem_sleep
s2idle [deep]
In the above, deep is selected. You can change it on the fly with:
printf s2idle > /sys/power/mem_sleep
Here's another test:
sep 30 22:25:50 angela systemd-sleep[32207]: /sys/class/power_supply/BAT1/charge_now                     =   1619 [mAh]
sep 30 22:31:30 angela systemd-sleep[32516]: /sys/class/power_supply/BAT1/charge_now                     =   1613 [mAh]
... better! 6 mAh in about 6 minutes, works out to 63.5mA, so more than two days standby. A longer test:
oct 01 09:22:56 angela systemd-sleep[62978]: /sys/class/power_supply/BAT1/charge_now                     =   3327 [mAh]
oct 01 12:47:35 angela systemd-sleep[63219]: /sys/class/power_supply/BAT1/charge_now                     =   3147 [mAh]
That's 180mAh in about 3.5h, 52mA! Now at 66h, or almost 3 days. I wasn't sure why I was seeing such fluctuations in those tests, but as it turns out, expansion card power tests show that they do significantly affect power usage, especially the SSD drive, which can take up to two full watts of power even when idle. I didn't control for expansion cards in the above tests running them with whatever card I had plugged in without paying attention so it's likely the cause of the high power usage and fluctuations. It might be possible to work around this problem by disabling USB devices before suspend. TODO. See also this post. In the meantime, I have been able to get much better suspend performance by unplugging all modules. Then I get this result:
oct 04 11:15:38 angela systemd-sleep[257571]: /sys/class/power_supply/BAT1/charge_now                     =   3203 [mAh]
oct 04 15:09:32 angela systemd-sleep[257866]: /sys/class/power_supply/BAT1/charge_now                     =   3145 [mAh]
Which is 14.8mA! Almost exactly the number promised by Framework! With a full battery, that means a 10 days suspend time. This is actually pretty good, and far beyond what I was expecting when starting down this journey. So, once the expansion cards are unplugged, suspend power usage is actually quite reasonable. More detailed standby tests are available in the standby-tests page, with a summary below. There is also some hope that the Chromebook edition specifically designed with a specification of 14 days standby time could bring some firmware improvements back down to the normal line. Some of those issues were reported upstream in April 2022, but there doesn't seem to have been any progress there since. TODO: one final solution here is suspend-then-hibernate, which Windows uses for this TODO: consider implementing the S0ix sleep states , see also troubleshooting TODO: consider https://github.com/intel/pm-graph

Standby expansion cards test results This table is a summary of the more extensive standby-tests I have performed:
Device Wattage Amperage Days Note
baseline 0.25W 16mA 9 sleep=deep nvme.noacpi=1
s2idle 0.29W 18.9mA ~7 sleep=s2idle nvme.noacpi=1
normal nvme 0.31W 20mA ~7 sleep=s2idle without nvme.noacpi=1
1 USB-C 0.23W 15mA ~10
2 USB-C 0.23W 14.9mA same as above
1 USB-A 0.75W 48.7mA 3 +500mW (!!) for the first USB-A card!
2 USB-A 1.11W 72mA 2 +360mW
3 USB-A 1.48W 96mA <2 +370mW
1TB SSD 0.49W 32mA <5 +260mW
MicroSD 0.52W 34mA ~4 +290mW
DisplayPort 0.85W 55mA <3 +620mW (!!)
1 HDMI 0.58W 38mA ~4 +250mW
2 HDMI 0.65W 42mA <4 +70mW (?)
Conclusions:
  • USB-C cards take no extra power on suspend, possibly less than empty slots, more testing required
  • USB-A cards take a lot more power on suspend (300-500mW) than on regular idle (~10mW, almost negligible)
  • 1TB SSD and MicroSD cards seem to take a reasonable amount of power (260-290mW), compared to their runtime equivalents (1-6W!)
  • DisplayPort takes a surprising lot of power (620mW), almost double its average runtime usage (390mW)
  • HDMI cards take, surprisingly, less power (250mW) in standby than the DP card (620mW)
  • and oddly, a second card adds less power usage (70mW?!) than the first, maybe a circuit is used by both?
A discussion of those results is in this forum post.

Standby expansion cards test results, 3.06 beta BIOS Framework recently (2022-11-07) announced that they will publish a firmware upgrade to address some of the USB-C issues, including power management. This could positively affect the above result, improving both standby and runtime power usage. The update came out in December 2022 and I redid my analysis with the following results:
Device Wattage Amperage Days Note
baseline 0.25W 16mA 9 no cards, same as before upgrade
1 USB-C 0.25W 16mA 9 same as before
2 USB-C 0.25W 16mA 9 same
1 USB-A 0.80W 62mA 3 +550mW!! worse than before
2 USB-A 1.12W 73mA <2 +320mW, on top of the above, bad!
Ethernet 0.62W 40mA 3-4 new result, decent
1TB SSD 0.52W 34mA 4 a bit worse than before (+2mA)
MicroSD 0.51W 22mA 4 same
DisplayPort 0.52W 34mA 4+ upgrade improved by 300mW
1 HDMI ? 38mA ? same
2 HDMI ? 45mA ? a bit worse than before (+3mA)
Normal 1.08W 70mA ~2 Ethernet, 2 USB-C, USB-A
Full results in standby-tests-306. The big takeaway for me is that the update did not improve power usage on the USB-A ports which is a big problem for my use case. There is a notable improvement on the DisplayPort power consumption which brings it more in line with the HDMI connector, but it still doesn't properly turn off on suspend either. Even worse, the USB-A ports now sometimes fails to resume after suspend, which is pretty annoying. This is a known problem that will hopefully get fixed in the final release.

Battery wear protection The BIOS has an option to limit charge to 80% to mitigate battery wear. There's a way to control the embedded controller from runtime with fw-ectool, partly documented here. The command would be:
sudo ectool fwchargelimit 80
I looked at building this myself but failed to run it. I opened a RFP in Debian so that we can ship this in Debian, and also documented my work there. Note that there is now a counter that tracks charge/discharge cycles. It's visible in tlp-stat -b, which is a nice improvement:
root@angela:/home/anarcat# tlp-stat -b
--- TLP 1.5.0 --------------------------------------------
+++ Battery Care
Plugin: generic
Supported features: none available
+++ Battery Status: BAT1
/sys/class/power_supply/BAT1/manufacturer                   = NVT
/sys/class/power_supply/BAT1/model_name                     = Framewo
/sys/class/power_supply/BAT1/cycle_count                    =      3
/sys/class/power_supply/BAT1/charge_full_design             =   3572 [mAh]
/sys/class/power_supply/BAT1/charge_full                    =   3541 [mAh]
/sys/class/power_supply/BAT1/charge_now                     =   1625 [mAh]
/sys/class/power_supply/BAT1/current_now                    =    178 [mA]
/sys/class/power_supply/BAT1/status                         = Discharging
/sys/class/power_supply/BAT1/charge_control_start_threshold = (not available)
/sys/class/power_supply/BAT1/charge_control_end_threshold   = (not available)
Charge                                                      =   45.9 [%]
Capacity                                                    =   99.1 [%]
One thing that is still missing is the charge threshold data (the (not available) above). There's been some work to make that accessible in August, stay tuned? This would also make it possible implement hysteresis support.

Ethernet expansion card The Framework ethernet expansion card is a fancy little doodle: "2.5Gbit/s and 10/100/1000Mbit/s Ethernet", the "clear housing lets you peek at the RTL8156 controller that powers it". Which is another way to say "we didn't completely finish prod on this one, so it kind of looks like we 3D-printed this in the shop".... The card is a little bulky, but I guess that's inevitable considering the RJ-45 form factor when compared to the thin Framework laptop. I have had a serious issue when trying it at first: the link LEDs just wouldn't come up. I made a full bug report in the forum and with upstream support, but eventually figured it out on my own. It's (of course) a power saving issue: if you reboot the machine, the links come up when the laptop is running the BIOS POST check and even when the Linux kernel boots. I first thought that the problem is likely related to the powertop service which I run at boot time to tweak some power saving settings. It seems like this:
echo 'on' > '/sys/bus/usb/devices/4-2/power/control'
... is a good workaround to bring the card back online. You can even return to power saving mode and the card will still work:
echo 'auto' > '/sys/bus/usb/devices/4-2/power/control'
Further research by Matt_Hartley from the Framework Team found this issue in the tlp tracker that shows how the USB_AUTOSUSPEND setting enables the power saving even if the driver doesn't support it, which, in retrospect, just sounds like a bad idea. To quote that issue:
By default, USB power saving is active in the kernel, but not force-enabled for incompatible drivers. That is, devices that support suspension will suspend, drivers that do not, will not.
So the fix is actually to uninstall tlp or disable that setting by adding this to /etc/tlp.conf:
USB_AUTOSUSPEND=0
... but that disables auto-suspend on all USB devices, which may hurt other power usage performance. I have found that a a combination of:
USB_AUTOSUSPEND=1
USB_DENYLIST="0bda:8156"
and this on the kernel commandline:
usbcore.quirks=0bda:8156:k
... actually does work correctly. I now have this in my /etc/default/grub.d/framework-tweaks.cfg file:
# net.ifnames=0: normal interface names ffs (e.g. eth0, wlan0, not wlp166
s0)
# nvme.noacpi=1: reduce SSD disk power usage (not working)
# mem_sleep_default=deep: reduce power usage during sleep (not working)
# usbcore.quirk is a workaround for the ethernet card suspend bug: https:
//guides.frame.work/Guide/Fedora+37+Installation+on+the+Framework+Laptop/
108?lang=en
GRUB_CMDLINE_LINUX="net.ifnames=0 nvme.noacpi=1 mem_sleep_default=deep usbcore.quirks=0bda:8156:k"
# fix the resolution in grub for fonts to not be tiny
GRUB_GFXMODE=1024x768
Other than that, I haven't been able to max out the card because I don't have other 2.5Gbit/s equipment at home, which is strangely satisfying. But running against my Turris Omnia router, I could pretty much max a gigabit fairly easily:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.09 GBytes   937 Mbits/sec  238             sender
[  5]   0.00-10.00  sec  1.09 GBytes   934 Mbits/sec                  receiver
The card doesn't require any proprietary firmware blobs which is surprising. Other than the power saving issues, it just works. In my power tests (see powerstat-wayland), the Ethernet card seems to use about 1.6W of power idle, without link, in the above "quirky" configuration where the card is functional but without autosuspend.

Proprietary firmware blobs The framework does need proprietary firmware to operate. Specifically:
  • the WiFi network card shipped with the DIY kit is a AX210 card that requires a 5.19 kernel or later, and the firmware-iwlwifi non-free firmware package
  • the Bluetooth adapter also loads the firmware-iwlwifi package (untested)
  • the graphics work out of the box without firmware, but certain power management features come only with special proprietary firmware, normally shipped in the firmware-misc-nonfree but currently missing from the package
Note that, at the time of writing, the latest i915 firmware from linux-firmware has a serious bug where loading all the accessible firmware results in noticeable I estimate 200-500ms lag between the keyboard (not the mouse!) and the display. Symptoms also include tearing and shearing of windows, it's pretty nasty. One workaround is to delete the two affected firmware files:
cd /lib/firmware && rm adlp_guc_70.1.1.bin adlp_guc_69.0.3.bin
update-initramfs -u
You will get the following warning during build, which is good as it means the problematic firmware is disabled:
W: Possible missing firmware /lib/firmware/i915/adlp_guc_69.0.3.bin for module i915
W: Possible missing firmware /lib/firmware/i915/adlp_guc_70.1.1.bin for module i915
But then it also means that critical firmware isn't loaded, which means, among other things, a higher battery drain. I was able to move from 8.5-10W down to the 7W range after making the firmware work properly. This is also after turning the backlight all the way down, as that takes a solid 2-3W in full blast. The proper fix is to use some compositing manager. I ended up using compton with the following systemd unit:
[Unit]
Description=start compositing manager
PartOf=graphical-session.target
ConditionHost=angela
[Service]
Type=exec
ExecStart=compton --show-all-xerrors --backend glx --vsync opengl-swc
Restart=on-failure
[Install]
RequiredBy=graphical-session.target
compton is orphaned however, so you might be tempted to use picom instead, but in my experience the latter uses much more power (1-2W extra, similar experience). I also tried compiz but it would just crash with:
anarcat@angela:~$ compiz --replace
compiz (core) - Warn: No XI2 extension
compiz (core) - Error: Another composite manager is already running on screen: 0
compiz (core) - Fatal: No manageable screens found on display :0
When running from the base session, I would get this instead:
compiz (core) - Warn: No XI2 extension
compiz (core) - Error: Couldn't load plugin 'ccp'
compiz (core) - Error: Couldn't load plugin 'ccp'
Thanks to EmanueleRocca for figuring all that out. See also this discussion about power management on the Framework forum. Note that Wayland environments do not require any special configuration here and actually work better, see my Wayland migration notes for details.
Also note that the iwlwifi firmware also looks incomplete. Even with the package installed, I get those errors in dmesg:
[   19.534429] Intel(R) Wireless WiFi driver for Linux
[   19.534691] iwlwifi 0000:a6:00.0: enabling device (0000 -> 0002)
[   19.541867] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-72.ucode (-2)
[   19.541881] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-72.ucode (-2)
[   19.541882] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-72.ucode failed with error -2
[   19.541890] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-71.ucode (-2)
[   19.541895] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-71.ucode (-2)
[   19.541896] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-71.ucode failed with error -2
[   19.541903] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-70.ucode (-2)
[   19.541907] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-70.ucode (-2)
[   19.541908] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-70.ucode failed with error -2
[   19.541913] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-69.ucode (-2)
[   19.541916] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-69.ucode (-2)
[   19.541917] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-69.ucode failed with error -2
[   19.541922] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-68.ucode (-2)
[   19.541926] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-68.ucode (-2)
[   19.541927] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-68.ucode failed with error -2
[   19.541933] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-67.ucode (-2)
[   19.541937] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-67.ucode (-2)
[   19.541937] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-67.ucode failed with error -2
[   19.544244] iwlwifi 0000:a6:00.0: firmware: direct-loading firmware iwlwifi-ty-a0-gf-a0-66.ucode
[   19.544257] iwlwifi 0000:a6:00.0: api flags index 2 larger than supported by driver
[   19.544270] iwlwifi 0000:a6:00.0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.63.2.1
[   19.544523] iwlwifi 0000:a6:00.0: firmware: failed to load iwl-debug-yoyo.bin (-2)
[   19.544528] iwlwifi 0000:a6:00.0: firmware: failed to load iwl-debug-yoyo.bin (-2)
[   19.544530] iwlwifi 0000:a6:00.0: loaded firmware version 66.55c64978.0 ty-a0-gf-a0-66.ucode op_mode iwlmvm
Some of those are available in the latest upstream firmware package (iwlwifi-ty-a0-gf-a0-71.ucode, -68, and -67), but not all (e.g. iwlwifi-ty-a0-gf-a0-72.ucode is missing) . It's unclear what those do or don't, as the WiFi seems to work well without them. I still copied them in from the latest linux-firmware package in the hope they would help with power management, but I did not notice a change after loading them. There are also multiple knobs on the iwlwifi and iwlmvm drivers. The latter has a power_schmeme setting which defaults to 2 (balanced), setting it to 3 (low power) could improve battery usage as well, in theory. The iwlwifi driver also has power_save (defaults to disabled) and power_level (1-5, defaults to 1) settings. See also the output of modinfo iwlwifi and modinfo iwlmvm for other driver options.

Graphics acceleration After loading the latest upstream firmware and setting up a compositing manager (compton, above), I tested the classic glxgears. Running in a window gives me odd results, as the gears basically grind to a halt:
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
137 frames in 5.1 seconds = 26.984 FPS
27 frames in 5.4 seconds =  5.022 FPS
Ouch. 5FPS! But interestingly, once the window is in full screen, it does hit the monitor refresh rate:
300 frames in 5.0 seconds = 60.000 FPS
I'm not really a gamer and I'm not normally using any of that fancy graphics acceleration stuff (except maybe my browser does?). I installed intel-gpu-tools for the intel_gpu_top command to confirm the GPU was engaged when doing those simulations. A nice find. Other useful diagnostic tools include glxgears and glxinfo (in mesa-utils) and (vainfo in vainfo). Following to this post, I also made sure to have those settings in my about:config in Firefox, or, in user.js:
user_pref("media.ffmpeg.vaapi.enabled", true);
Note that the guide suggests many other settings to tweak, but those might actually be overkill, see this comment and its parents. I did try forcing hardware acceleration by setting gfx.webrender.all to true, but everything became choppy and weird. The guide also mentions installing the intel-media-driver package, but I could not find that in Debian. The Arch wiki has, as usual, an excellent reference on hardware acceleration in Firefox.

Chromium / Signal desktop bugs It looks like both Chromium and Signal Desktop misbehave with my compositor setup (compton + i3). The fix is to add a persistent flag to Chromium. In Arch, it's conveniently in ~/.config/chromium-flags.conf but that doesn't actually work in Debian. I had to put the flag in /etc/chromium.d/disable-compositing, like this:
export CHROMIUM_FLAGS="$CHROMIUM_FLAGS --disable-gpu-compositing"
It's possible another one of the hundreds of flags might fix this issue better, but I don't really have time to go through this entire, incomplete, and unofficial list (!?!). Signal Desktop is a similar problem, and doesn't reuse those flags (because of course it doesn't). Instead I had to rewrite the wrapper script in /usr/local/bin/signal-desktop to use this instead:
exec /usr/bin/flatpak run --branch=stable --arch=x86_64 org.signal.Signal --disable-gpu-compositing "$@"
This was mostly done in this Puppet commit. I haven't figured out the root of this problem. I did try using picom and xcompmgr; they both suffer from the same issue. Another Debian testing user on Wayland told me they haven't seen this problem, so hopefully this can be fixed by switching to wayland.

Graphics card hangs I believe I might have this bug which results in a total graphical hang for 15-30 seconds. It's fairly rare so it's not too disruptive, but when it does happen, it's pretty alarming. The comments on that bug report are encouraging though: it seems this is a bug in either mesa or the Intel graphics driver, which means many people have this problem so it's likely to be fixed. There's actually a merge request on mesa already (2022-12-29). It could also be that bug because the error message I get is actually:
Jan 20 12:49:10 angela kernel: Asynchronous wait on fence 0000:00:02.0:sway[104431]:cb0ae timed out (hint:intel_atomic_commit_ready [i915]) 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] HuC authenticated 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] GuC submission enabled 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled
It's a solid 30 seconds graphical hang. Maybe the keyboard and everything else keeps working. The latter bug report is quite long, with many comments, but this one from January 2023 seems to say that Sway 1.8 fixed the problem. There's also an earlier patch to add an extra kernel parameter that supposedly fixes that too. There's all sorts of other workarounds in there, for example this:
echo "options i915 enable_dc=1 enable_guc_loading=1 enable_guc_submission=1 edp_vswing=0 enable_guc=2 enable_fbc=1 enable_psr=1 disable_power_well=0"   sudo tee /etc/modprobe.d/i915.conf
from this comment... So that one is unsolved, as far as the upstream drivers are concerned, but maybe could be fixed through Sway.

Weird USB hangs / graphical glitches I have had weird connectivity glitches better described in this post, but basically: my USB keyboard and mice (connected over a USB hub) drop keys, lag a lot or hang, and I get visual glitches. The fix was to tighten the screws around the CPU on the motherboard (!), which is, thankfully, a rather simple repair.

USB docks are hell Note that the monitors are hooked up to angela through a USB-C / Thunderbolt dock from Cable Matters, with the lovely name of 201053-SIL. It has issues, see this blog post for an in-depth discussion.

Shipping details I ordered the Framework in August 2022 and received it about a month later, which is sooner than expected because the August batch was late. People (including me) expected this to have an impact on the September batch, but it seems Framework have been able to fix the delivery problems and keep up with the demand. As of early 2023, their website announces that laptops ship "within 5 days". I have myself ordered a few expansion cards in November 2022, and they shipped on the same day, arriving 3-4 days later.

The supply pipeline There are basically 6 steps in the Framework shipping pipeline, each (except the last) accompanied with an email notification:
  1. pre-order
  2. preparing batch
  3. preparing order
  4. payment complete
  5. shipping
  6. (received)
This comes from the crowdsourced spreadsheet, which should be updated when the status changes here. I was part of the "third batch" of the 12th generation laptop, which was supposed to ship in September. It ended up arriving on my door step on September 27th, about 33 days after ordering. It seems current orders are not processed in "batches", but in real time, see this blog post for details on shipping.

Shipping trivia I don't know about the others, but my laptop shipped through no less than four different airplane flights. Here are the hops it took: I can't quite figure out how to calculate exactly how much mileage that is, but it's huge. The ride through Alaska is surprising enough but the bounce back through Winnipeg is especially weird. I guess the route happens that way because of Fedex shipping hubs. There was a related oddity when I had my Purism laptop shipped: it left from the west coast and seemed to enter on an endless, two week long road trip across the continental US.

Other resources

10 March 2023

Antoine Beaupr : how to audit for open services with iproute2

The computer world has a tendency of reinventing the wheel once in a while. I am not a fan of that process, but sometimes I just have to bite the bullet and adapt to change. This post explains how I adapted to one particular change: the netstat to sockstat transition. I used to do this to show which processes where listening on which port on a server:
netstat -anpe
It was a handy mnemonic as, in France, ANPE was the agency responsible for the unemployed (basically). That would list all sockets (-a), not resolve hostnames (-n, because it's slow), show processes attached to the socket (-p) with extra info like the user (-e). This still works, but sometimes fail to find the actual process hooked to the port. Plus, it lists a whole bunch of UNIX sockets and non-listening sockets, which are generally irrelevant for such an audit. What I really wanted to use was really something like:
netstat -pleunt   sort
... which has the "pleut" mnemonic ("rains", but plural, which makes no sense and would be badly spelled anyway). That also only lists listening (-l) and network sockets, specifically UDP (-u) and TCP (-t). But enough with the legacy, let's try the brave new world of sockstat which has the unfortunate acronym ss. The equivalent sockstat command to the above is:
ss -pleuntO
It's similar to the above, except we need the -O flag otherwise ss does that confusing thing where it splits the output on multiple lines. But I actually use:
ss -plunt0
... i.e. without the -e as the information it gives (cgroup, fd number, etc) is not much more useful than what's already provided with -p (service and UID). All of the above also show sockets that are not actually a concern because they only listen on localhost. Those one should be filtered out. So now we embark into that wild filtering ride. This is going to list all open sockets and show the port number and service:
ss -pluntO --no-header   sed 's/^\([a-z]*\) *[A-Z]* *[0-9]* [0-9]* *[0-9]* */\1/'   sed 's/^[^:]*:\(:\]:\)\?//;s/\([0-9]*\) *[^ ]*/\1\t/;s/,fd=[0-9]*//'   sort -gu
For example on my desktop, it looks like:
anarcat@angela:~$ sudo ss -pluntO --no-header   sed 's/^\([a-z]*\) *[A-Z]* *[0-9]* [0-9]* *[0-9]* */\1/'   sed 's/^[^:]*:\(:\]:\)\?//;s/\([0-9]*\) *[^ ]*/\1\t/;s/,fd=[0-9]*//'   sort -gu
          [::]:* users:(("unbound",pid=1864))        
22  users:(("sshd",pid=1830))           
25  users:(("master",pid=3150))        
53  users:(("unbound",pid=1864))        
323 users:(("chronyd",pid=1876))        
500 users:(("charon",pid=2817))        
631 users:(("cups-browsed",pid=2744))   
2628    users:(("dictd",pid=2825))          
4001    users:(("emacs",pid=3578))          
4500    users:(("charon",pid=2817))        
5353    users:(("avahi-daemon",pid=1423))  
6600    users:(("systemd",pid=3461))       
8384    users:(("syncthing",pid=232169))   
9050    users:(("tor",pid=2857))            
21027   users:(("syncthing",pid=232169))   
22000   users:(("syncthing",pid=232169))   
33231   users:(("syncthing",pid=232169))   
34953   users:(("syncthing",pid=232169))   
35770   users:(("syncthing",pid=232169))   
44944   users:(("syncthing",pid=232169))   
47337   users:(("syncthing",pid=232169))   
48903   users:(("mosh-client",pid=234126))  
52774   users:(("syncthing",pid=232169))   
52938   users:(("avahi-daemon",pid=1423))  
54029   users:(("avahi-daemon",pid=1423))  
anarcat@angela:~$
But that doesn't filter out the localhost stuff, lots of false positive (like emacs, above). And this is where it gets... not fun, as you need to match "localhost" but we don't resolve names, so you need to do some fancy pattern matching:
ss -pluntO --no-header   \
    sed 's/^\([a-z]*\) *[A-Z]* *[0-9]* [0-9]* *[0-9]* */\1/;s/^tcp//;s/^udp//'   \
    grep -v -e '^\[fe80::' -e '^127.0.0.1' -e '^\[::1\]' -e '^192\.' -e '^172\.'   \
    sed 's/^[^:]*:\(:\]:\)\?//;s/\([0-9]*\) *[^ ]*/\1\t/;s/,fd=[0-9]*//'  \
    sort -gu
This is kind of horrible, but it works, those are the actually open ports on my machine:
anarcat@angela:~$ sudo ss -pluntO --no-header           sed 's/^\([a-
z]*\) *[A-Z]* *[0-9]* [0-9]* *[0-9]* */\1/;s/^tcp//;s/^udp//'        
   grep -v -e '^\[fe80::' -e '^127.0.0.1' -e '^\[::1\]' -e '^192\.' -
e '^172\.'           sed 's/^[^:]*:\(:\]:\)\?//;s/\([0-9]*\) *[^ ]*/\
1\t/;s/,fd=[0-9]*//'          sort -gu
22  users:(("sshd",pid=1830))           
500 users:(("charon",pid=2817))        
631 users:(("cups-browsed",pid=2744))   
4500    users:(("charon",pid=2817))        
5353    users:(("avahi-daemon",pid=1423))  
6600    users:(("systemd",pid=3461))       
21027   users:(("syncthing",pid=232169))   
22000   users:(("syncthing",pid=232169))   
34953   users:(("syncthing",pid=232169))   
35770   users:(("syncthing",pid=232169))   
48903   users:(("mosh-client",pid=234126))  
52938   users:(("avahi-daemon",pid=1423))  
54029   users:(("avahi-daemon",pid=1423))
Surely there must be a better way. It turns out that lsof can do some of this, and it's relatively straightforward. This lists all listening TCP sockets:
lsof -iTCP -sTCP:LISTEN +c 15   grep -v localhost   sort
A shorter version from Adam Shand is:
lsof -i @localhost
... which basically replaces the grep -v localhost line. In theory, this would do the equivalent on UDP
lsof -iUDP -sUDP:^Idle
... but in reality, it looks like lsof on Linux can't figure out the state of a UDP socket:
lsof: no UDP state names available: UDP:^Idle
... which, honestly, I'm baffled by. It's strange because ss can figure out the state of those sockets, heck it's how -l vs -a works after all. So we need something else to show listening UDP sockets. The following actually looks pretty good after all:
ss -pluO
That will list localhost sockets of course, so we can explicitly ask ss to resolve those and filter them out with something like:
ss -plurO   grep -v localhost
oh, and look here! ss supports pattern matching, so we can actually tell it to ignore localhost directly, which removes that horrible sed line we used earlier:
ss -pluntO '! ( src = localhost )'
That actually gives a pretty readable output. One annoyance is we can't really modify the columns here, so we still need some god-awful sed hacking on top of that to get a cleaner output:
ss -nplutO '! ( src = localhost )'    \
    sed 's/\(udp\ tcp\).*:\([0-9][0-9]*\)/\2\t\1\t/;s/\([0-9][0-9]*\t[udtcp]*\t\)[^u]*users:(("/\1/;s/".*//;s/.*Address:Port.*/Netid\tPort\tProcess/'   \
    sort -nu
That looks horrible and is basically impossible to memorize. But it sure looks nice:
anarcat@angela:~$ sudo ss -nplutO '! ( src = localhost )'    sed 's/\(udp\ tcp\).*:\([0-9][0-9]*\)/\2\t\1\t/;s/\([0-9][0-9]*\t[udtcp]*\t\)[^u]*users:(("/\1/;s/".*//;s/.*Address:Port.*/Port\tNetid\tProcess/'   sort -nu
Port    Netid   Process
22  tcp sshd
500 udp charon
546 udp NetworkManager
631 udp cups-browsed
4500    udp charon
5353    udp avahi-daemon
6600    tcp systemd
21027   udp syncthing
22000   udp syncthing
34953   udp syncthing
35770   udp syncthing
48903   udp mosh-client
52938   udp avahi-daemon
54029   udp avahi-daemon
Better ideas welcome.

7 March 2023

Jonathan Dowland: Welcome Oblivion 10th Anniversary

I haven t done one of these for a while, and they ll be less frequent than I once planned as I m working from home less and less. I'm also trying to get back into exploring my digital music collection, and more generally engaging with digital music again.
 picture of a vinyl record
It s the ten year anniversary of the first (and last) LP by How To Destroy Angels (HTDA), the side-project of Trent Reznor with his wife, his Nine Inch Nails (NIN) partner in crime Atticus Ross and visual artist (and NIN artistic director) Rob Sheridan. This album was a real pleasure. For NIN fans, it wasn't clear what the future held after the start of HTDA. But this work really stood alone, similar in some ways to NIN but sufficiently different to be fresh and exciting. In stark contrast to NIN (at the time), it was interesting to see the members of HTDA presented on an equal footing, especially Rob Sheridan, who wasn't a musician. The intent was to try and put the visual work on the same level of esteem as the musical. HTDA performed a few live shows, but none outside the US. They were apparently quite a spectacle. As an artefact, this is a gorgeous LP. The gatefold cover and all four sides of the two record sleeves are covered in unique pieces of Sheridan's glitch art. When I originally bought this I had a rather generously-sized individual office at the University, so I framed and displayed many of these pieces on my office walls. Sheridan has since written extensively on the processes and techniques he used for this style of art, and has produced many more works using the same techniques. You can see some on his website, patreon, fine art print shop or Threadless store. Late last year I treated myself to a large print of some related work, analog(Oblivion)000b, which (once the framing is done) I'm going to hang in my home office. The LP had two tracks that were not present in the CD or digital release versions of the album, although a CD was bundled in the LP which included the tracks. (The Knife did something similar with Shaking the Habitual, at around the same time). I've had some multitrack stems from this album sitting in my "for archive.org" folder for a while, so I took the opportunity of the 10th anniversary to upload them, here: https://archive.org/details/htda_multitracks

1 March 2023

Debian XMPP Team: XMPP What's new in Debian 12 bookworm

On Tue 13 July 2021 there was a blog post of new XMPP related software releases which have been uploaded to Debian 11 (bullseye). Today, we will inform you about updates for the upcoming Debian release bookworm. A lot of new releases have been provided by the upstream projects. There were lot of changes to the XMPP clients like Dino, Gajim, Profanity, Poezio and others. Also the XMPP servers have been enhanced. Unfortunately, we can not provide a list of all the changes which have been done, but will try to highlight some of the changes and new features. BTW, feel free to join the Debian User Support on Jabber at xmpp:debian@conference.debian.org?join. You can find a list of 58 packages of the Debian XMPP team on the XMPP QA Page. Server Libs Others Happy chatting - keep in touch with your family and friends via Jabber / XMPP - XMPP is an open standard of the Internet Engineering Task Force (IETF) for instant messaging.

15 February 2023

Lukas M rdian: Netplan v0.106 is now available

I m happy to announce that Netplan version 0.106 is now available on GitHub and is soon to be deployed into an Ubuntu/Debian/Fedora installation near you! Six months and 65 commits after the previous version, this release is brought to you by 4 free software contributors from around the globe. Highlights Highlights of this release include the new netplan status command, which queries your system for IP addresses, routes, DNS information, etc in addition to the Netplan backend renderer (NetworkManager/networkd) in use and the relevant Netplan YAML configuration ID. It displays all this in a nicely formatted way (or alternatively in machine readable YAML/JSON format).
Furthermore, we implemented a clean libnetplan API which can be used by external tools to parse Netplan configuration, migrated away from non-inclusive language (PR#303) and improved the overall Netplan documentation. Another change that should be noted, is that the match.macaddress stanza now only matches on PermanentMACAddress= on the systemd-networkd backend, as has been the case on the NetworkManager backend ever since (see PR#278 for background information on this slight change in behavior). Changelog Bug fixes:

8 February 2023

Chris Lamb: Most anticipated films of 2023

Very few highly-anticipated movies appear in January and February, as the bigger releases are timed so they can be considered for the Golden Globes in January and the Oscars in late February or early March, so film fans have the advantage of a few weeks after the New Year to collect their thoughts on the year ahead. In other words, I'm not actually late in outlining below the films I'm most looking forward to in 2023...

Barbie No, seriously! If anyone can make a good film about a doll franchise, it's probably Greta Gerwig. Not only was Little Women (2019) more than admirable, the same could be definitely said for Lady Bird (2017). More importantly, I can't help feel she was the real 'Driver' behind Frances Ha (2012), one of the better modern takes on Claudia Weill's revelatory Girlfriends (1978). Still, whenever I remember that Barbie will be a film about a billion-dollar toy and media franchise with a nettlesome history, I recall I rubbished the "Facebook film" that turned into The Social Network (2010). Anyway, the trailer for Barbie is worth watching, if only because it seems like a parody of itself.

Blitz It's difficult to overstate just how important the aerial bombing of London during World War II is crucial to understanding the British psyche, despite it being a constructed phenomenon from the outset. Without wishing to underplay the deaths of over 40,000 civilian deaths, Angus Calder pointed out in the 1990s that the modern mythology surrounding the event "did not evolve spontaneously; it was a propaganda construct directed as much at [then neutral] American opinion as at British." It will therefore be interesting to see how British Grenadian Trinidadian director Steve McQueen addresses a topic so essential to the British self-conception. (Remember the controversy in right-wing circles about the sole Indian soldier in Christopher Nolan's Dunkirk (2017)?) McQueen is perhaps best known for his 12 Years a Slave (2013), but he recently directed a six-part film anthology for the BBC which addressed the realities of post-Empire immigration to Britain, and this leads me to suspect he sees the Blitz and its surrounding mythology with a more critical perspective. But any attempt to complicate the story of World War II will be vigorously opposed in a way that will make the recent hullabaloo surrounding The Crown seem tame. All this is to say that the discourse surrounding this release may be as interesting as the film itself.

Dune, Part II Coming out of the cinema after the first part of Denis Vileneve's adaptation of Dune (2021), I was struck by the conception that it was less of a fresh adaptation of the 1965 novel by Frank Herbert than an attempt to rehabilitate David Lynch's 1984 version and in a broader sense, it was also an attempt to reestablish the primacy of cinema over streaming TV and the myriad of other distractions in our lives. I must admit I'm not a huge fan of the original novel, finding within it a certain prurience regarding hereditary military regimes and writing about them with a certain sense of glee that belies a secret admiration for them... not to mention an eyebrow-raising allegory for the Middle East. Still, Dune, Part II is going to be a fantastic spectacle.

Ferrari It'll be curious to see how this differs substantially from the recent Ford v Ferrari (2019), but given that Michael Mann's Heat (1995) so effectively re-energised the gangster/heist genre, I'm more than willing to kick the tires of this about the founder of the eponymous car manufacturer. I'm in the minority for preferring Mann's Thief (1981) over Heat, in part because the former deals in more abstract themes, so I'd have perhaps prefered to look forward to a more conceptual film from Mann over a story about one specific guy.

How Do You Live There are a few directors one can look forward to watching almost without qualification, and Hayao Miyazaki (My Neighbor Totoro, Kiki's Delivery Service, Princess Mononoke Howl's Moving Castle, etc.) is one of them. And this is especially so given that The Wind Rises (2013) was meant to be the last collaboration between Miyazaki and Studio Ghibli. Let's hope he is able to come out of retirement in another ten years.

Indiana Jones and the Dial of Destiny Given I had a strong dislike of Indiana Jones and the Kingdom of the Crystal Skull (2008), I seriously doubt I will enjoy anything this film has to show me, but with 1981's Raiders of the Lost Ark remaining one of my most treasured films (read my brief homage), I still feel a strong sense of obligation towards the Indiana Jones name, despite it feeling like the copper is being pulled out of the walls of this franchise today.

Kafka I only know Polish filmmaker Agnieszka Holland through her Spoor (2017), an adaptation of Olga Tokarczuk's 2009 eco-crime novel Drive Your Plow Over the Bones of the Dead. I wasn't an unqualified fan of Spoor (nor the book on which it is based), but I am interested in Holland's take on the life of Czech author Franz Kafka, an author enmeshed with twentieth-century art and philosophy, especially that of central Europe. Holland has mentioned she intends to tell the story "as a kind of collage," and I can hope that it is an adventurous take on the over-furrowed biopic genre. Or perhaps Gregor Samsa will awake from uneasy dreams to find himself transformed in his bed into a huge verminous biopic.

The Killer It'll be interesting to see what path David Fincher is taking today, especially after his puzzling and strangely cold Mank (2020) portraying the writing process behind Orson Welles' Citizen Kane (1941). The Killer is said to be a straight-to-Netflix thriller based on the graphic novel about a hired assassin, which makes me think of Fincher's Zodiac (2007), and, of course, Se7en (1995). I'm not as entranced by Fincher as I used to be, but any film with Michael Fassbender and Tilda Swinton (with a score by Trent Reznor) is always going to get my attention.

Killers of the Flower Moon In Killers of the Flower Moon, Martin Scorsese directs an adaptation of a book about the FBI's investigation into a conspiracy to murder Osage tribe members in the early years of the twentieth century in order to deprive them of their oil-rich land. (The only thing more quintessentially American than apple pie is a conspiracy combined with a genocide.) Separate from learning more about this disquieting chapter of American history, I'd love to discover what attracted Scorsese to this particular story: he's one of the few top-level directors who have the ability to lucidly articulate their intentions and motivations.

Napoleon It often strikes me that, despite all of his achievements and fame, it's somehow still possible to claim that Ridley Scott is relatively underrated compared to other directors working at the top level today. Besides that, though, I'm especially interested in this film, not least of all because I just read Tolstoy's War and Peace (read my recent review) and am working my way through the mind-boggling 431-minute Soviet TV adaptation, but also because several auteur filmmakers (including Stanley Kubrick) have tried to make a Napoleon epic and failed.

Oppenheimer In a way, a biopic about the scientist responsible for the atomic bomb and the Manhattan Project seems almost perfect material for Christopher Nolan. He can certainly rely on stars to queue up to be in his movies (Robert Downey Jr., Matt Damon, Kenneth Branagh, etc.), but whilst I'm certain it will be entertaining on many fronts, I fear it will fall into the well-established Nolan mould of yet another single man struggling with obsession, deception and guilt who is trying in vain to balance order and chaos in the world.

The Way of the Wind Marked by philosophical and spiritual overtones, all of Terrence Malick's films are perfumed with themes of transcendence, nature and the inevitable conflict between instinct and reason. My particular favourite is his stunning Days of Heaven (1978), but The Thin Red Line (1998) and A Hidden Life (2019) also touched me ways difficult to relate, and are one of the few films about the Second World War that don't touch off my sensitivity about them (see my remarks about Blitz above). It is therefore somewhat Malickian that his next film will be a biblical drama about the life of Jesus. Given Malick's filmography, I suspect this will be far more subdued than William Wyler's 1959 Ben-Hur and significantly more equivocal in its conviction compared to Paolo Pasolini's ardently progressive The Gospel According to St. Matthew (1964). However, little beyond that can be guessed, and the film may not even appear until 2024 or even 2025.

Zone of Interest I was mesmerised by Jonathan Glazer's Under the Skin (2013), and there is much to admire in his borderline 'revisionist gangster' film Sexy Beast (2000), so I will definitely be on the lookout for this one. The only thing making me hesitate is that Zone of Interest is based on a book by Martin Amis about a romance set inside the Auschwitz concentration camp. I haven't read the book, but Amis has something of a history in his grappling with the history of the twentieth century, and he seems to do it in a way that never sits right with me. But if Paul Verhoeven's Starship Troopers (1997) proves anything at all, it's all in the adaption.

6 February 2023

Reproducible Builds: Reproducible Builds in January 2023

Welcome to the first report for 2023 from the Reproducible Builds project! In these reports we try and outline the most important things that we have been up to over the past month, as well as the most important things in/around the community. As a quick recap, the motivation behind the reproducible builds effort is to ensure no malicious flaws can be deliberately introduced during compilation and distribution of the software that we run on our devices. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.


News In a curious turn of events, GitHub first announced this month that the checksums of various Git archives may be subject to change, specifically that because:
the default compression for Git archives has recently changed. As result, archives downloaded from GitHub may have different checksums even though the contents are completely unchanged.
This change (which was brought up on our mailing list last October) would have had quite wide-ranging implications for anyone wishing to validate and verify downloaded archives using cryptographic signatures. However, GitHub reversed this decision, updating their original announcement with a message that We are reverting this change for now. More details to follow. It appears that this was informed in part by an in-depth discussion in the GitHub Community issue tracker.
The Bundesamt f r Sicherheit in der Informationstechnik (BSI) (trans: The Federal Office for Information Security ) is the agency in charge of managing computer and communication security for the German federal government. They recently produced a report that touches on attacks on software supply-chains (Supply-Chain-Angriff). (German PDF)
Contributor Seb35 updated our website to fix broken links to Tails Git repository [ ][ ], and Holger updated a large number of pages around our recent summit in Venice [ ][ ][ ][ ].
Noak J nsson has written an interesting paper entitled The State of Software Diversity in the Software Supply Chain of Ethereum Clients. As the paper outlines:
In this report, the software supply chains of the most popular Ethereum clients are cataloged and analyzed. The dependency graphs of Ethereum clients developed in Go, Rust, and Java, are studied. These client are Geth, Prysm, OpenEthereum, Lighthouse, Besu, and Teku. To do so, their dependency graphs are transformed into a unified format. Quantitative metrics are used to depict the software supply chain of the blockchain. The results show a clear difference in the size of the software supply chain required for the execution layer and consensus layer of Ethereum.

Yongkui Han posted to our mailing list discussing making reproducible builds & GitBOM work together without gitBOM-ID embedding. GitBOM (now renamed to OmniBOR) is a project to enable automatic, verifiable artifact resolution across today s diverse software supply-chains [ ]. In addition, Fabian Keil wrote to us asking whether anyone in the community would be at Chemnitz Linux Days 2023, which is due to take place on 11th and 12th March (event info). Separate to this, Akihiro Suda posted to our mailing list just after the end of the month with a status report of bit-for-bit reproducible Docker/OCI images. As Akihiro mentions in their post, they will be giving a talk at FOSDEM in the Containers devroom titled Bit-for-bit reproducible builds with Dockerfile and that my talk will also mention how to pin the apt/dnf/apk/pacman packages with my repro-get tool.
The extremely popular Signal messenger app added upstream support for the SOURCE_DATE_EPOCH environment variable this month. This means that release tarballs of the Signal desktop client do not embed nondeterministic release information. [ ][ ]

Distribution work

F-Droid & Android There was a very large number of changes in the F-Droid and wider Android ecosystem this month: On January 15th, a blog post entitled Towards a reproducible F-Droid was published on the F-Droid website, outlining the reasons why F-Droid signs published APKs with its own keys and how reproducible builds allow using upstream developers keys instead. In particular:
In response to [ ] criticisms, we started encouraging new apps to enable reproducible builds. It turns out that reproducible builds are not so difficult to achieve for many apps. In the past few months we ve gotten many more reproducible apps in F-Droid than before. Currently we can t highlight which apps are reproducible in the client, so maybe you haven t noticed that there are many new apps signed with upstream developers keys.
(There was a discussion about this post on Hacker News.) In addition:
  • F-Droid added 13 apps published with reproducible builds this month. [ ]
  • FC Stegerman outlined a bug where baseline.profm files are nondeterministic, developed a workaround, and provided all the details required for a fix. As they note, this issue has now been fixed but the fix is not yet part of an official Android Gradle plugin release.
  • GitLab user Parwor discovered that the number of CPU cores can affect the reproducibility of .dex files. [ ]
  • FC Stegerman also announced the 0.2.0 and 0.2.1 releases of reproducible-apk-tools, a suite of tools to help make .apk files reproducible. Several new subcommands and scripts were added, and a number of bugs were fixed as well [ ][ ]. They also updated the F-Droid website to improve the reproducibility-related documentation. [ ][ ]
  • On the F-Droid issue tracker, FC Stegerman discussed reproducible builds with one of the developers of the Threema messenger app and reported that Android SDK build-tools 31.0.0 and 32.0.0 (unlike earlier and later versions) have a zipalign command that produces incorrect padding.
  • A number of bugs related to reproducibility were discovered in Android itself. Firstly, the non-deterministic order of .zip entries in .apk files [ ] and then newline differences between building on Windows versus Linux that can make builds not reproducible as well. [ ] (Note that these links may require a Google account to view.)
  • And just before the end of the month, FC Stegerman started a thread on our mailing list on the topic of hiding data/code in APK embedded signatures which has been made possible by the Android APK Signature Scheme v2/v3. As part of this, they made an Android app that reads the APK Signing block of its own APK and extracts a payload in order to alter its behaviour called sigblock-code-poc.

Debian As mentioned in last month s report, Vagrant Cascadian has been organising a series of online sprints in order to clear the huge backlog of reproducible builds patches submitted by performing NMUs (Non-Maintainer Uploads). During January, a sprint took place on the 10th, resulting in the following uploads: During this sprint, Holger Levsen filed Debian bug #1028615 to request that the tracker.debian.org service display results of reproducible rebuilds, not just reproducible CI results. Elsewhere in Debian, strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month, version 1.13.1-1 was uploaded to Debian unstable by Holger Levsen, including a fix by FC Stegerman (obfusk) to update a regular expression for the latest version of file(1) [ ]. (#1028892) Lastly, 65 reviews of Debian packages were added, 21 were updated and 35 were removed this month adding to our knowledge about identified issues.

Other distributions In other distributions:

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes to diffoscope, including preparing and uploading versions 231, 232, 233 and 234 to Debian:
  • No need for from __future__ import print_function import anymore. [ ]
  • Comment and tidy the extras_require.json handling. [ ]
  • Split inline Python code to generate test Recommends into a separate Python script. [ ]
  • Update debian/tests/control after merging support for PyPDF support. [ ]
  • Correctly catch segfaulting cd-iccdump binary. [ ]
  • Drop some old debugging code. [ ]
  • Allow ICC tests to (temporarily) fail. [ ]
In addition, FC Stegerman (obfusk) made a number of changes, including:
  • Updating the test_text_proper_indentation test to support the latest version(s) of file(1). [ ]
  • Use an extras_require.json file to store some build/release metadata, instead of accessing the internet. [ ]
  • Updating an APK-related file(1) regular expression. [ ]
  • On the diffoscope.org website, de-duplicate contributors by e-mail. [ ]
Lastly, Sam James added support for PyPDF version 3 [ ] and Vagrant Cascadian updated a handful of tool references for GNU Guix. [ ][ ]

Upstream patches The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. This month, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project operates a comprehensive testing framework at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In January, the following changes were made by Holger Levsen:
  • Node changes:
  • Debian-related changes:
    • Only keep diffoscope s HTML output (ie. no .json or .txt) for LTS suites and older in order to save diskspace on the Jenkins host. [ ]
    • Re-create pbuilder base less frequently for the stretch, bookworm and experimental suites. [ ]
  • OpenWrt-related changes:
    • Add gcc-multilib to OPENWRT_HOST_PACKAGES and install it on the nodes that need it. [ ]
    • Detect more problems in the health check when failing to build OpenWrt. [ ]
  • Misc changes:
    • Update the chroot-run script to correctly manage /dev and /dev/pts. [ ][ ][ ]
    • Update the Jenkins shell monitor script to collect disk stats less frequently [ ] and to include various directory stats. [ ][ ]
    • Update the real year in the configuration in order to be able to detect whether a node is running in the future or not. [ ]
    • Bump copyright years in the default page footer. [ ]
In addition, Christian Marangi submitted a patch to build OpenWrt packages with the V=s flag to enable debugging. [ ]
If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can get in touch with us via:

5 February 2023

Jonathan Dowland: 2022 in reading

In 2022 I read 34 books (-19% on last year). In 2021 roughly a quarter of the books I read were written by women. I was determined to push that ratio in 2022, so I made an effort to try and only read books by women. I knew that I wouldn't manage that, but by trying to, I did get the ratio up to 58% (by page count). I'm not sure what will happen in 2023. My to-read pile has some back-pressure from books by male authors I postponed reading in 2022 (in particular new works by Christopher Priest and Adam Roberts). It's possible the ratio will swing back the other way, which would mean it would not be worth repeating the experiment. At least if the ratio is the point of the exercise. But perhaps it isn't: perhaps the useful outcome is more qualitative than quantitative. I tried to read some new (to me) authors. I really enjoyed Shirley Jackson (The Haunting of Hill House, We Have Always Lived In The Castle). I Struggled with Angela Carter's Heroes and Villains although I plan to return to her other work, in particular, The Bloody Chamber. I also got through Donna Tartt's The Secret History on the recommendation of a friend. I had to push through the first 15% or so but it turned out to be worth it.
a book cover for Shirley Jackson's 'We have always lived in the castle'
a book cover for Margaret Atwood's 'The Handmaid's Tale'
a book cover for Adam Roberts' 'The This'
a book cover for Emily St. John Mandel's 'Sea of Tranquility'

I finally read (and loved) The Handmaid's Tale, which I had never read despite loving Atwood. My top non-fiction book was The Nanny State Made Me by Stuart Maconie. I still read far more fiction than non-fiction. Or perhaps I'm not keeping track of non- fiction as well. I feel non-fiction requires a different approach to reading: not necessarily linear; it's not always important to read the whole book; it's often important to re-read sections. It might not make sense to consider them in the same bracket. My favourite novels this year were Sea of Tranquility by Emily St. John Mandel, a standalone sort-of sequel to The Glass House but in a very different genre; and The This by Adam Roberts, which was equally remarkable. The This has an interesting narrative device in the first third where a stream of tweets is presented in parallel with the main text. This works well, and does a good job of capturing the figurative river of tweet-like stuff that is woven into our lives at the moment. But I can't help but wonder how they tackle that in the audiobook.

2 February 2023

John Goerzen: Using Yggdrasil As an Automatic Mesh Fabric to Connect All Your Docker Containers, VMs, and Servers

Sometimes you might want to run Docker containers on more than one host. Maybe you want to run some at one hosting facility, some at another, and so forth. Maybe you d like run VMs at various places, and let them talk to Docker containers and bare metal servers wherever they are. And maybe you d like to be able to easily migrate any of these from one provider to another. There are all sorts of very complicated ways to set all this stuff up. But there s also a simple one: Yggdrasil. My blog post Make the Internet Yours Again With an Instant Mesh Network explains some of the possibilities of Yggdrasil in general terms. Here I want to show you how to use Yggdrasil to solve some of these issues more specifically. Because Yggdrasil is always Encrypted, some of the security lifting is done for us.

Background Often in Docker, we connect multiple containers to a single network that runs on a given host. That much is easy. Once you start talking about containers on multiple hosts, then you start adding layers and layers of complexity. Once you start talking multiple providers, maybe multiple continents, then the complexity can increase. And, if you want to integrate everything from bare metal servers to VMs into this well, there are ways, but they re not easy. I m a believer in the KISS principle. Let s not make things complex when we don t have to.

Enter Yggdrasil As I ve explained before, Yggdrasil can automatically form a global mesh network. This is pretty cool! As most people use it, they join it to the main Yggdrasil network. But Yggdrasil can be run entirely privately as well. You can run your own private mesh, and that s what we ll talk about here. All we have to do is run Yggdrasil inside each container, VM, server, or whatever. We handle some basics of connectivity, and bam! Everything is host- and location-agnostic.

Setup in Docker The installation of Yggdrasil on a regular system is pretty straightforward. Docker is a bit more complicated for several reasons:
  • It blocks IPv6 inside containers by default
  • The default set of permissions doesn t permit you to set up tunnels inside a container
  • It doesn t typically pass multicast (broadcast) packets
Normally, Yggdrasil could auto-discover peers on a LAN interface. However, aside from some esoteric Docker networking approaches, Docker doesn t permit that. So my approach is going to be setting up one or more Yggdrasil router containers on a given Docker host. All the other containers talk directly to the router container and it s all good.

Basic installation In my Dockerfile, I have something like this:
FROM jgoerzen/debian-base-security:bullseye
RUN echo "deb http://deb.debian.org/debian bullseye-backports main" >> /etc/apt/sources.list && \
    apt-get --allow-releaseinfo-change update && \
    apt-get -y --no-install-recommends -t bullseye-backports install yggdrasil
...
COPY yggdrasil.conf /etc/yggdrasil/
RUN set -x; \
    chown root:yggdrasil /etc/yggdrasil/yggdrasil.conf && \
    chmod 0750 /etc/yggdrasil/yggdrasil.conf && \
    systemctl enable yggdrasil
The magic parameters to docker run to make Yggdrasil work are:
--cap-add=NET_ADMIN --sysctl net.ipv6.conf.all.disable_ipv6=0 --device=/dev/net/tun:/dev/net/tun
This example uses my docker-debian-base images, so if you use them as well, you ll also need to add their parameters. Note that it is NOT necessary to use --privileged. In fact, due to the network namespaces in use in Docker, this command does not let the container modify the host s networking (unless you use --net=host, which I do not recommend). The --sysctl parameter was the result of a lot of banging my head against the wall. Apparently Docker tries to disable IPv6 in the container by default. Annoying.

Configuration of the router container(s) The idea is that the router node (or more than one, if you want redundancy) will be the only ones to have an open incoming port. Although the normal Yggdrasil case of directly detecting peers in a broadcast domain is more convenient and more robust, this can work pretty well too. You can, of course, generate a template yggdrasil.conf with yggdrasil -genconf like usual. Some things to note for this one:
  • You ll want to change Listen to something like Listen: ["tls://[::]:12345"] where 12345 is the port number you ll be listening on.
  • You ll want to disable the MulticastInterfaces entirely by just setting it to [] since it doesn t work anyway.
  • If you expose the port to the Internet, you ll certainly want to firewall it to only authorized peers. Setting AllowedPublicKeys is another useful step.
  • If you have more than one router container on a host, each of them will both Listen and act as a client to the others. See below.

Configuration of the non-router nodes Again, you can start with a simple configuration. Some notes here:
  • You ll want to set Peers to something like Peers: ["tls://routernode:12345"] where routernode is the Docker hostname of the router container, and 12345 is its port number as defined above. If you have more than one local router container, you can simply list them all here. Yggdrasil will then fail over nicely if any one of them go down.
  • Listen should be empty.
  • As above, MulticastInterfaces should be empty.

Using the interfaces At this point, you should be able to ping6 between your containers. If you have multiple hosts running Docker, you can simply set up the router nodes on each to connect to each other. Now you have direct, secure, container-to-container communication that is host-agnostic! You can also set up Yggdrasil on a bare metal server or VM using standard procedures and everything will just talk nicely!

Security notes Yggdrasil s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure your internal comms stay private: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. Both can be used, and they can be used simultaneously. By disabling multicast discovery, you eliminate the chance for random machines on the LAN to join the mesh. By making sure that you firewall off (outside of Yggdrasil) who can connect to a Yggdrasil node with a listening port, you can authorize only your own machines. And, by setting AllowedPublicKeys on the nodes with listening ports, you can authenticate the Yggdrasil peers. Note that part of the benefit of the Yggdrasil mesh is normally that you don t have to propagate a configuration change to every participatory node that s a nice thing in general! You can also run a firewall inside your container (I like firehol for this purpose) and aggressively firewall the IPs that are allowed to connect via the Yggdrasil interface. I like to set a stable interface name like ygg0 in yggdrasil.conf, and then it becomes pretty easy to firewall the services. The Docker parameters that allow Yggdrasil to run are also sufficient to run firehol.

Naming Yggdrasil peers You probably don t want to hard-code Yggdrasil IPs all over the place. There are a few solutions:
  • You could run an internal DNS service
  • You can do a bit of scripting around Docker s --add-host command to add things to /etc/hosts

Other hints & conclusion Here are some other helpful use cases:
  • If you are migrating between hosts, you could leave your reverse proxy up at both hosts, both pointing to the target containers over Yggdrasil. The targets will be automatically found from both sides of the migration while you wait for DNS caches to update and such.
  • This can make services integrate with local networks a lot more painlessly than they might otherwise.
This is just an idea. The point of Yggdrasil is expanding our ideas of what we can do with a network, so here s one such expansion. Have fun!
Note: This post also has a permanent home on my webiste, where it may be periodically updated.

1 February 2023

Julian Andres Klode: Ubuntu 2022v1 secure boot key rotation and friends

This is the story of the currently progressing changes to secure boot on Ubuntu and the history of how we got to where we are.

taking a step back: how does secure boot on Ubuntu work? Booting on Ubuntu involves three components after the firmware:
  1. shim
  2. grub
  3. linux
Each of these is a PE binary signed with a key. The shim is signed by Microsoft s 3rd party key and embeds a self-signed Canonical CA certificate, and optionally a vendor dbx (a list of revoked certificates or binaries). grub and linux (and fwupd) are then signed by a certificate issued by that CA In Ubuntu s case, the CA certificate is sharded: Multiple people each have a part of the key and they need to meet to be able to combine it and sign things, such as new code signing certificates.

BootHole When BootHole happened in 2020, travel was suspended and we hence could not rotate to a new signing certificate. So when it came to updating our shim for the CVEs, we had to revoke all previously signed kernels, grubs, shims, fwupds by their hashes. This generated a very large vendor dbx which caused lots of issues as shim exported them to a UEFI variable, and not everyone had enough space for such large variables. Sigh. We decided we want to rotate our signing key next time. This was also when upstream added SBAT metadata to shim and grub. This gives a simple versioning scheme for security updates and easy revocation using a simple EFI variable that shim writes to and reads from.

Spring 2022 CVEs We still were not ready for travel in 2021, but during BootHole we developed the SBAT mechanism, so one could revoke a grub or shim by setting a single EFI variable. We actually missed rotating the shim this cycle as a new vulnerability was reported immediately after it, and we decided to hold on to it.

2022 key rotation and the fall CVEs This caused some problems when the 2nd CVE round came, as we did not have a shim with the latest SBAT level, and neither did a lot of others, so we ended up deciding upstream to not bump the shim SBAT requirements just yet. Sigh. Anyway, in October we were meeting again for the first time at a Canonical sprint, and the shardholders got together and created three new signing keys: 2022v1, 2022v2, and 2022v3. It took us until January before they were installed into the signing service and PPAs setup to sign with them. We also submitted a shim 15.7 with the old keys revoked which came back at around the same time. Now we were in a hurry. The 22.04.2 point release was scheduled for around middle of February, and we had nothing signed with the new keys yet, but our new shim which we need for the point release (so the point release media remains bootable after the next round of CVEs), required new keys. So how do we ensure that users have kernels, grubs, and fwupd signed with the new key before we install the new shim?

upgrade ordering grub and fwupd are simple cases: For grub, we depend on the new version. We decided to backport grub 2.06 to all releases (which moved focal and bionic up from 2.04), and kept the versioning of the -signed packages the same across all releases, so we were able to simply bump the Depends for grub to specify the new minimum version. For fwupd-efi, we added Breaks. (Actually, we also had a backport of the CVEs for 2.04 based grub, and we did publish that for 20.04 signed with the old keys before backporting 2.06 to it.) Kernels are a different story: There are about 60 kernels out there. My initial idea was that we could just add Breaks for all of them. So our meta package linux-image-generic which depends on linux-image-$(uname -r)-generic, we d simply add Breaks: linux-image-generic ( 5.19.0-31) and then adjust those breaks for each series. This would have been super annoying, but ultimately I figured this would be the safest option. This however caused concern, because it could be that apt decides to remove the kernel metapackage. I explored checking the kernels at runtime and aborting if we don t have a trusted kernel in preinst. This ensures that if you try to upgrade shim without having a kernel, it would fail to install. But this ultimately has a couple of issues:
  1. It aborts the entire transaction at that point, so users will be unable to run apt upgrade until they have a recent kernel.
  2. We cannot even guarantee that a kernel would be unpacked first. So even if you got a new kernel, apt/dpkg might attempt to unpack it first and then the preinst would fail because no kernel is present yet.
Ultimately we believed the danger to be too large given that no kernels had yet been released to users. If we had kernels pushed out for 1-2 months already, this would have been a viable choice. So in the end, I ended up modifying the shim packaging to install both the latest shim and the previous one, and an update-alternatives alternative to select between the two: In it s post-installation maintainer script, shim-signed checks whether all kernels with a version greater or equal to the running one are not revoked, and if so, it will setup the latest alternative with priority 100 and the previous with a priority of 50. If one or more of those kernels was signed with a revoked key, it will swap the priorities around, so that the previous version is preferred. Now this is fairly static, and we do want you to switch to the latest shim eventually, so I also added hooks to the kernel install to trigger the shim-signed postinst script when a new kernel is being installed. It will then update the alternatives based on the current set of kernels, and if it now points to the latest shim, reinstall shim and grub to the ESP. Ultimately this means that once you install your 2nd non-revoked kernel, or you install a non-revoked kernel and then reconfigure shim or the kernel, you will get the latest shim. When you install your first non-revoked kernel, your currently booted kernel is still revoked, so it s not upgraded immediately. This has a benefit in that you will most likely have two kernels you can boot without disabling secure boot.

regressions Of course, the first version I uploaded had still some remaining hardcoded shimx64 in the scripts and so failed to install on arm64 where shimaa64 is used. And if that were not enough, I also forgot to include support for gzip compressed kernels there. Sigh, I need better testing infrastructure to be able to easily run arm64 tests as well (I only tested the actual booting there, not the scripts). shim-signed migrated to the release pocket in lunar fairly quickly, but this caused images to stop working, because the new shim was installed into images, but no kernel was available yet, so we had to demote it to proposed and block migration. Despite all the work done for end users, we need to be careful to roll this out for image building.

another grub update for OOM issues. We had two grubs to release: First there was the security update for the recent set of CVEs, then there also was an OOM issue for large initrds which was blocking critical OEM work. We fixed the OOM issue by cherry-picking all 2.12 memory management patches, as well as the red hat patches to the loader we take from there. This ended up a fairly large patch set and I was hesitant to tie the security update to that, so I ended up pushing the security update everywhere first, and then pushed the OOM fixes this week. With the OOM patches, you should be able to boot initrds of between 400M and 1GB, it also depends on the memory layout of your machine and your screen resolution and background images. So OEM team had success testing 400MB irl, and I tested up to I think it was 1.2GB in qemu, I ran out of FAT space then and stopped going higher :D

other features in this round
  • Intel TDX support in grub and shim
  • Kernels are allocated as CODE now not DATA as per the upstream mm changes, might fix boot on X13s

am I using this yet? The new signing keys are used in:
  • shim-signed 1.54 on 22.10+, 1.51.3 on 22.04, 1.40.9 on 20.04, 1.37~18.04.13 on 18.04
  • grub2-signed 1.187.2~ or newer (binary packages grub-efi-amd64-signed or grub-efi-arm64-signed), 1.192 on 23.04.
  • fwupd-signed 1.51~ or newer
  • various linux updates. Check apt changelog linux-image-unsigned-$(uname -r) to see if Revoke & rotate to new signing key (LP: #2002812) is mentioned in there to see if it signed with the new key.
If you were able to install shim-signed, your grub and fwupd-efi will have the correct version as that is ensured by packaging. However your shim may still point to the old one. To check which shim will be used by grub-install, you can check the status of the shimx64.efi.signed or (on arm64) shimaa64.efi.signed alternative. The best link needs to point to the file ending in latest:
$ update-alternatives --display shimx64.efi.signed
shimx64.efi.signed - auto mode
  link best version is /usr/lib/shim/shimx64.efi.signed.latest
  link currently points to /usr/lib/shim/shimx64.efi.signed.latest
  link shimx64.efi.signed is /usr/lib/shim/shimx64.efi.signed
/usr/lib/shim/shimx64.efi.signed.latest - priority 100
/usr/lib/shim/shimx64.efi.signed.previous - priority 50
If it does not, but you have installed a new kernel compatible with the new shim, you can switch immediately to the new shim after rebooting into the kernel by running dpkg-reconfigure shim-signed. You ll see in the output if the shim was updated, or you can check the output of update-alternatives as you did above after the reconfiguration has finished. For the out of memory issues in grub, you need grub2-signed 1.187.3~ (same binaries as above).

how do I test this (while it s in proposed)?
  1. upgrade your kernel to proposed and reboot into that
  2. upgrade your grub-efi-amd64-signed, shim-signed, fwupd-signed to proposed.
If you already upgraded your shim before your kernel, don t worry:
  1. upgrade your kernel and reboot
  2. run dpkg-reconfigure shim-signed
And you ll be all good to go.

deep dive: uploading signed boot assets to Ubuntu For each signed boot asset, we build one version in the latest stable release and the development release. We then binary copy the built binaries from the latest stable release to older stable releases. This process ensures two things: We know the next stable release is able to build the assets and we also minimize the number of signed assets. OK, I lied. For shim, we actually do not build in the development release but copy the binaries upward from the latest stable, as each shim needs to go through external signing. The entire workflow looks something like this:
  1. Upload the unsigned package to one of the following build PPAs:
  2. Upload the signed package to the same PPA
  3. For stable release uploads:
    • Copy the unsigned package back across all stable releases in the PPA
    • Upload the signed package for stable releases to the same PPA with ~<release>.1 appended to the version
  4. Submit a request to canonical-signing-jobs to sign the uploads. The signing job helper copies the binary -unsigned packages to the primary-2022v1 PPA where they are signed, creating a signing tarball, then it copies the source package for the -signed package to the same PPA which then downloads the signing tarball during build and places the signed assets into the -signed deb. Resulting binaries will be placed into the proposed PPA: https://launchpad.net/~ubuntu-uefi-team/+archive/ubuntu/proposed
  5. Review the binaries themselves
  6. Unembargo and binary copy the binaries from the proposed PPA to the proposed-public PPA: https://launchpad.net/~ubuntu-uefi-team/+archive/ubuntu/proposed-public. This step is not strictly necessary, but it enables tools like sru-review to work, as they cannot access the packages from the normal private proposed PPA.
  7. Binary copy from proposed-public to the proposed queue(s) in the primary archive
Lots of steps!

WIP As of writing, only the grub updates have been released, other updates are still being verified in proposed. An update for fwupd in bionic will be issued at a later point, removing the EFI bits from the fwupd 1.2 packaging and using the separate fwupd-efi project instead like later release series.

29 January 2023

Gunnar Wolf: miniDebConf Tamil Nadu 2023

Greetings from Viluppuram, Tamil Nadu, South India! As a preparation and warm-up for DebConf in September, the Debian people in India have organized a miniDebConf. Well, I don t want to be unfair to them They have been regularly organizing miniDebConfs for over a decade, and while most of the attendees are students local to this state in South India (the very tip of the country; Tamil Nadu is the Eastern side, and Kerala, where Kochi is and DebConf will be held, is the Western side), I have talked with attendees from very different regions of this country. This miniDebConf is somewhat similar to similarly-scoped events I have attended in Latin America: It is mostly an outreach conference, but it s also a great opportunity for DDs in India to meet in the famous hallway track. India is incredibly multicultural. Today at the hotel, I was somewhat surprised to see people from Kerala trying to read a text written in Tamil: Not only the languages are different, but the writing systems also are. From what I read, Tamil script is a bit simpler to Kerala s Mayalayam, although they come from similar roots. Of course, my school of thought is that, whenever you visit a city, culture or country that differs from the place you were born, a fundamental component to explore and to remember is Food! And one of the things I most looked forward for this trip was that precisely. I arrived to the Chennai Airport (MAA) 8:15 local time yesterday morning, so I am far from an expert but I have been given (and most happily received) three times biryani (pictured in the photo by this paragraph). It is delicious, although I cannot yet describe the borders of what should or should not be considered proper biryani): The base dish is rice, and you go mixing it with different sauces or foods. What managed to surprise us foreigners is, strangely, well known for us all: there is no spoon. No, the food is not pushed to your mouth using metal or wooden utensils. Not even using a tortilla as back home, or by breaking bits of the injera that serves also as a dish, as in Ethiopia. Sure, there is naan, but it is completely optional, and would be a bit too much for as big a big dish as what we have got. Biryani is eaten With the tools natural to us primates: the fingers. We have learnt some differnt techniques but so far, I am still using the base technique (thumb-finger-middle). I m closing the report with the photo of the closing of the conference as it happens. And I will, of course, share our adventures as they unfold in the next couple of days. Because Well, we finished with the conference-y part of the trip, but we have a full week of (pre-)DebConf work ahead of us!

26 January 2023

Louis-Philippe V ronneau: Montreal Subway Foot Traffic Data, 2022 edition

For the fourth year in a row, I've asked Soci t de Transport de Montr al, Montreal's transit agency, for the foot traffic data of Montreal's subway. By clicking on a subway station, you'll be redirected to a graph of the station's foot traffic. Licences

22 January 2023

Dirk Eddelbuettel: BH 1.81.0-1 oon CRAN: New Upstream, New Library, sprintf Change

Boost Boost is a very large and comprehensive set of (peer-reviewed) libraries for the C++ programming language, containing well over one hundred individual libraries. The BH package provides a sizeable subset of header-only libraries for (easier, no linking required) use by R. It is fairly widely used: the (partial) CRAN mirror logs (aggregated from the cloud mirrors) show over 32.6 million package downloads. Version 1.81.0 of Boost was released in December following the regular Boost release schedule of April, August and December releases. As the commits and changelog show, we packaged it almost immediately and started testing following our annual update cycle which strives to balance being close enough to upstream and not stressing CRAN and the user base too much. The reverse depends check revealed about a handful of packages requiring changes or adjustments which is a pretty good outcome given the over three hundred direct reverse dependencies. So we opened issue #88 to coordinate the issue over the winter break during which CRAN also closes (just as we did before), and also send a wider PSA tweet as a heads-up. Our sincere thanks to the two packages that already updated, and the four that likely will soon. Our thanks also to CRAN for reviewing the package impact over the last few days since I uploaded the package earlier this week. There are a number of changes I have to make each time in BH, and it is worth mentioning them. Because CRAN cares about backwards compatibility and the ability to be used on minimal or older systems, we still adjust the filenames of a few files to fit a jurassic constraints of just over a 100 characters per filepath present in some long-outdated versions of tar. Not a big deal. We also, and that is more controversial, silence a number of #pragma diagnostic messages for g++ and clang++ because CRAN insists on it. I have no choice in that matter. Next, and hopefully this time only, we also found and replaced a few remaining sprintf uses and replaced them with snprintf. Many of the Boost libraries did that, so I hope by the next upgrade for Boost 1.84.0 next winter this will be fully taken care of. Lastly, and also only this time, we silenced a warning about Boost switching to C++14 in the next release 1.82.0 in April. This may matter for a number of packages having a hard-wired selection of C++11 as their C++ language standard. Luckily our compilers are good enough for C++14 so worst case I will have to nudge a few packages next December. This release adds one new library for url processing which struck us as potentially quite useful. The more detailed NEWS log follows.

Changes in version 1.81.0-1 (2023-01-17)
  • Upgrade to Boost 1.81.0 (#87)
  • Added url (new in 1.81.0)
  • Converted remaining sprintf to snprintf (#90 fixing #89)
  • Comment-out gcc warning messages in three files

Via my CRANberries, there is a diffstat report relative to the previous release. Comments and suggestions about BH are welcome via the issue tracker at the GitHub repo. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Next.

Previous.